Google DeepMind Releases Perch 2.0 to Identify Wildlife from 1.5M+ Audio Recordings
Image Source: Google DeepMind
Google DeepMind has launched Perch 2.0, an enhanced open-source artificial intelligence model that processes audio recordings to identify animal species and monitor ecosystem health, building on its 2023 predecessor with broader species coverage and improved transfer learning.
The update, announced in September 2025, expands the model's training data to include mammals, amphibians and insects alongside birds, enabling conservationists to sift through vast datasets from rainforests, wetlands and other terrestrial environments more efficiently. Through transfer learning, its capabilities also extend to underwater settings such as coral reefs.
Background
Bioacoustics, the study of animal sounds to gauge biodiversity, has long relied on microphones and hydrophones to capture data from remote areas. However, the sheer volume of recordings, often running into millions of hours, has posed challenges for manual review by experts.
This field emerged as a non-invasive tool for tracking endangered species, with early efforts focusing on birds due to their distinct vocalisations. The need for AI arose from the limitations of traditional methods, which struggle with subtle or overlapping sounds in noisy environments like underwater habitats.
Google DeepMind entered this space with the original Perch in 2023, initially trained on avian data to classify over 10,000 bird species. The push for an update stemmed from feedback from conservation groups highlighting gaps in multi-taxa analysis and demand for tools adaptable to new settings without extensive retraining.
Development of Perch 2.0
Perch 2.0 was developed by a team at Google DeepMind, drawing on public datasets such as Xeno-Canto and iNaturalist compiled in early 2025. The model uses an EfficientNet-B3 architecture with 12 million parameters, converting five-second audio clips into embeddings for classification.
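The announcement describes this embedding workflow only at a high level, so the following is a minimal sketch of the idea under stated assumptions: a field recording is resampled, split into five-second windows, and each window is mapped to a fixed-length embedding. The 32 kHz sample rate, the `librosa` dependency and the `embed_fn` callable are illustrative placeholders, not the official Perch interface.

```python
# Sketch only: window a recording into five-second clips and embed each one.
# The sample rate, the librosa dependency and embed_fn are assumptions for
# illustration; the released Perch code on Kaggle/GitHub defines the real API.
import numpy as np
import librosa

SAMPLE_RATE = 32_000                           # assumed fixed input rate
WINDOW_SECONDS = 5                             # Perch 2.0 operates on 5 s clips
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS

def load_windows(path: str) -> np.ndarray:
    """Load audio, resample to the model rate, and split into 5 s windows."""
    audio, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    n_windows = len(audio) // WINDOW_SAMPLES
    return audio[: n_windows * WINDOW_SAMPLES].reshape(n_windows, WINDOW_SAMPLES)

def embed_windows(windows: np.ndarray, embed_fn) -> np.ndarray:
    """Map each window to an embedding via a model call supplied by the caller."""
    return np.stack([embed_fn(w[np.newaxis, :])[0] for w in windows])
```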
Training involved 1.54 million recordings across 14,795 classes, nearly double the data used for the first version. This comprised 1.37 million bird recordings, 55,000 amphibian, 63,000 insect and 15,000 mammal recordings, with techniques such as mixup augmentation used to handle noisy labels.
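The announcement names mixup among the techniques used to cope with noisy labels but does not give its settings; the sketch below shows the standard formulation, in which two examples and their label vectors are blended with a Beta-distributed weight, purely for illustration.

```python
# Standard mixup augmentation on waveforms and multi-hot label vectors.
# The alpha value and the exact variant used for Perch 2.0 are assumptions.
import numpy as np

def mixup(x1: np.ndarray, y1: np.ndarray,
          x2: np.ndarray, y2: np.ndarray, alpha: float = 0.2):
    """Blend two training examples; soft targets help absorb label noise."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1.0 - lam) * x2, lam * y1 + (1.0 - lam) * y2
```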
Collaborations with organisations such as the Cornell Lab of Ornithology and BirdLife Australia informed the refinements, ensuring the model accounts for anthropogenic noise and supports agile modelling, in which new classifiers can be built from minimal labelled examples in under an hour.
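Agile modelling, as described, keeps the large encoder frozen and trains only a small classifier on its embeddings, so a detector for a new sound can be built from a handful of labelled clips. The sketch below is one plausible realisation using scikit-learn; the `embeddings` and `labels` inputs are assumed to come from a step like the windowing example above, and the choice of logistic regression is illustrative rather than the released tooling.

```python
# Sketch of the agile-modelling idea: fit a lightweight linear probe on
# precomputed Perch embeddings for a new target sound. Classifier choice,
# input arrays and evaluation scheme are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def fit_agile_classifier(embeddings: np.ndarray, labels: np.ndarray):
    """Train a small binary classifier on frozen embeddings in seconds."""
    clf = LogisticRegression(max_iter=1000)
    mean_score = cross_val_score(clf, embeddings, labels, cv=5).mean()
    clf.fit(embeddings, labels)
    return clf, mean_score
```

Because only the probe is trained, iterating on new classes takes minutes rather than the hours or days needed to retrain a full network, which is what makes the reported under-an-hour turnaround plausible.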
The open-source release on Kaggle and GitHub allows global access, with code for inference and evaluation available under standard licences, though running it requires a suitable setup such as a configured Python environment.
Key Capabilities
The model is highly effective at picking out animal calls from complex soundscapes, identifying birds and other species such as frogs directly, while marine animals such as whales and fish are detected using custom classifiers or transfer learning. In one case study, it processed data up to 50 times faster than traditional methods.
Adaptation to underwater environments marks a key advance, outperforming specialised marine models in transfer tasks despite limited aquatic training data.
Real-World Applications
In Australia, Perch tools aided BirdLife Australia and the Australian Acoustic Observatory in identifying a new population of the critically endangered Plains Wanderer west of Melbourne, the first sighting there in three decades.
At the University of Hawaii's LOHE Bioacoustics Lab, the model accelerated detection of calls from honeycreepers, birds threatened by avian malaria and of cultural importance in local mythology, allowing broader coverage of their populations.
For marine use, the related SurfPerch model supports coral reef health assessments by identifying fish and invertebrate sounds, reducing the need for divers in remote or deep sites through efficient analysis of passive audio data.
Impact on Conservation
Perch 2.0 enhances biodiversity monitoring by enabling scalable, low-cost analysis that minimises invasive techniques like animal tagging. It frees resources for fieldwork, potentially aiding efforts to track population trends amid climate change and habitat loss.
The original model has been downloaded more than 250,000 times, and the update could broaden adoption among researchers further, supporting global initiatives to protect ecosystems from rainforests in Peru to reefs in the Pacific.
Limitations include dataset imbalances favouring birds and exclusions such as ultrasonic bat calls, which may affect performance in certain niches.
Future Trends
Looking ahead, advancements in bioacoustics AI point to semi-supervised learning on unlabelled data and integration with other sensors for multimodal monitoring. Perch 2.0's emphasis on efficient, supervised models suggests a shift towards accessible tools for field biologists, potentially expanding to more taxa such as reptiles.
As conservation faces growing pressures, such AI developments could play a pivotal role in informing policy and restoration projects worldwide.
Source: Google DeepMind, arXiv
