
Ever heard a bird singing and wondered what kind of bird?

The sound recognition feature in the Cornell University Lab of Ornithology’s Merlin birding app can answer that question for you, according to a July 14, 2021 article by Steven Melendez for Fast Company (Note: Links have been removed),

The lab recently upgraded its Merlin smartphone app, designed for both new and experienced birdwatchers. It now features an AI-infused “Sound ID” feature that can capture bird sounds and compare them to crowdsourced samples to figure out just what bird is making that sound. … people have used it to identify more than 1 million birds. New user counts are also up 58% since the two weeks before launch, and up 44% over the same period last year, according to Drew Weber, Merlin’s project coordinator.

Even when it’s listening to bird sounds, the app still relies on recent advances in image recognition, says project research engineer Grant Van Horn. …, it actually transforms the sound into a visual graph called a spectrogram, similar to what you might see in an audio editing program. Then, it analyzes that spectrogram to look for similarities to known bird calls, which come from the Cornell Lab’s eBird citizen science project.
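To make the waveform-to-spectrogram step a little more concrete, here is a minimal sketch in plain Python. It is not Merlin's code; the function names (`spectrogram`, `synth_tone`), the frame size, the hop length, and the synthetic 2,000 Hz test tone are all my own assumptions for illustration. It slices a signal into short overlapping frames and measures how much energy each frame has at each frequency, which is the grid of numbers an image classifier could then treat as a picture.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame lists the magnitude per frequency bin.

    Illustrative only: a naive discrete Fourier transform over short,
    overlapping, Hann-windowed frames (slow, but easy to follow).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # frequencies from 0 up to half the sample rate
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


def synth_tone(freq_hz, sample_rate, duration_s):
    """Hypothetical stand-in for a recording: a pure sine tone."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


sample_rate = 8000
tone = synth_tone(2000.0, sample_rate, 0.05)
spec = spectrogram(tone)

# Find the loudest frequency bin in the first frame and convert it to hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames;", "loudest frequency:", peak_hz, "Hz")
```

A real recording would produce a far busier grid than this single tone, and a production system would almost certainly use a fast Fourier transform library rather than the slow loop above.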

There’s more detail about Merlin in Marc Devokaitis’ June 23, 2021 article for the Cornell Chronicle,

… Merlin can recognize the sounds of more than 400 species from the U.S. and Canada, with that number set to expand rapidly in future updates.

As Merlin listens, it uses artificial intelligence (AI) technology to identify each species, displaying in real time a list and photos of the birds that are singing or calling.

Automatic song ID has been a dream for decades, but analyzing sound has always been extremely difficult. The breakthrough came when researchers, including Merlin lead researcher Grant Van Horn, began treating the sounds as images and applying new and powerful image classification algorithms like the ones that already power Merlin’s Photo ID feature.

“Each sound recording a user makes gets converted from a waveform to a spectrogram – a way to visualize the amplitude [volume], frequency [pitch] and duration of the sound,” Van Horn said. “So just like Merlin can identify a picture of a bird, it can now use this picture of a bird’s sound to make an ID.”

Merlin’s pioneering approach to sound identification is powered by tens of thousands of citizen scientists who contributed their bird observations and sound recordings to eBird, the Cornell Lab’s global database.

“Thousands of sound recordings train Merlin to recognize each bird species, and more than a billion bird observations in eBird tell Merlin which birds are likely to be present at a particular place and time,” said Drew Weber, Merlin project coordinator. “Having this incredibly robust bird dataset – and feeding that into faster and more powerful machine-learning tools – enables Merlin to identify birds by sound now, when doing so seemed like a daunting challenge just a few years ago.”

The Merlin Bird ID app with the new Sound ID feature is available for free on iOS and Android devices. Click here to download the Merlin Bird ID app and follow the prompts. If you already have Merlin installed on your phone, tap “Get Sound ID.”

Do take a look at Devokaitis’ June 23, 2021 article for more about how the Merlin app provides four ways to identify birds.
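Weber's point that eBird observations tell Merlin which birds are likely to be present at a particular place and time can be illustrated with a small sketch. Neither article says how Merlin actually combines the two sources of information, so everything here is an assumption on my part: the `rerank` function, the `floor` parameter, and all of the numbers are hypothetical. The idea shown is simply to weight each sound-based score by how often the species is reported locally, then rescale so the results sum to one.

```python
def rerank(sound_scores, local_frequency, floor=0.01):
    """Weight sound-based scores by local sighting frequency (hypothetical scheme).

    sound_scores: species -> score from a sound classifier (made-up values here).
    local_frequency: species -> share of local checklists reporting the species.
    floor: small minimum weight so an unreported species is never ruled out entirely.
    """
    weighted = {
        species: score * max(local_frequency.get(species, 0.0), floor)
        for species, score in sound_scores.items()
    }
    total = sum(weighted.values())
    return {species: value / total for species, value in weighted.items()}


# Made-up classifier output: the sound alone cannot separate two thrush-like songs.
sound_scores = {
    "Hermit Thrush": 0.50,
    "Common Nightingale": 0.35,
    "American Robin": 0.15,
}
# Made-up local data for a North American site where nightingales are not reported.
local_frequency = {"Hermit Thrush": 0.20, "American Robin": 0.60}

ranked = rerank(sound_scores, local_frequency)
best_guess = max(ranked, key=ranked.get)
print(best_guess)
```

In this toy example the nightingale, which sounded plausible on audio alone, drops to the bottom of the list once the local observations are taken into account.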

For anyone who likes to listen to the news, there’s an August 26, 2021 podcast (The Warblers by Birds Canada) featuring Drew Weber, Merlin project coordinator, and Jody Allair, Birds Canada Director of Community Engagement, discussing Merlin,

It’s a dream come true – there’s finally an app for identifying bird sounds. In the next episode of The Warblers podcast, we’ll explore the Merlin Bird ID app’s new Sound ID feature and how artificial intelligence is redefining birding. We talk with Drew Weber and Jody Allair and go deep into the implications and opportunities that this technology will bring for birds, and new as well as experienced birders.

The Warblers is hosted by Andrea Gress and Andrés Jiménez.

What human speech, jazz, and whale song have in common

Credit: iStock/Velvetfish

Seeing connections between seemingly unrelated activities such as human speech, jazz, and whale song is fascinating to me, and I’m not alone. Scientists at the University of California at Merced (UC Merced) have delivered handily on that premise, according to an Oct. 13, 2017 news item on phys.org,

Jazz musicians riffing with each other, humans talking to each other and pods of killer whales all have interactive conversations that are remarkably similar to each other, new research reveals.

Cognitive science researchers at UC Merced have developed a new method for analyzing and comparing the sounds of speech, music and complex animal vocalizations like whale song and bird song. The paper detailing their findings is being published today [Oct. 12, 2017] in the Journal of the Royal Society Interface.

Their method is based on the idea that these sounds are complex because they have multiple layers of structure. Every language, for instance, has individual sounds, roughly corresponding to letters, that combine to form syllables, words, phrases, sentences and so on. It’s a hierarchy that everyone understands intuitively. Musical compositions have their own temporal hierarchies, but until now there hasn’t been a way to directly compare the hierarchies of speech and music, or test whether similar hierarchies might exist in bird song and whale song.

An Oct. 12, 2017 UC Merced news release by Lorena Anderson, which originated the news item, provides more details about the investigation (Note: Links have been removed),

“Playing jazz music has been likened to a conversation among musicians, and killer whales are highly social creatures who vocalize as if they are talking to each other. But does jazz music really sound like a conversation, and do killer whales really sound like they are talking?” asked lead researcher and UC Merced professor Chris Kello. “We know killer whales are highly social and intelligent, but it’s hard to tell that they are interacting when you listen to recordings of them. Our method shows how much their sound patterns are like people talking, but not like other, less social whales or birds.”

The researchers figured out a way to measure and compare sound recordings by converting them into “barcodes” that capture clusters of sound energy, and clusters of clusters, across levels of a hierarchy. These barcodes allowed the researchers to directly compare temporal hierarchies in more than 200 recordings of different kinds of speech in six different languages, different kinds of popular and classical music, four different species of birds and whales singing their songs, and even thunderstorms.

Kello and his colleagues have been using the barcode method for several years. They first developed it in studies of conversations. The study published today is the first time that they applied the method to music and animal vocalizations.

“The method allows us to ask questions about language and music and animal songs that we couldn’t ask without a way to see and compare patterns in all these recordings,” Kello said.

A common song

The researchers compared barcode-style visualizations of recorded sounds.
Credit: UC Merced

Kello, fellow UC Merced cognitive science professor Ramesh Balasubramaniam, graduate student Butovens Médé and collaborator professor Simone Dalla Bella also discovered that the haunting songs of huge humpback whales are remarkably similar to the beautiful songs of tiny nightingales and hermit thrushes in terms of their temporal hierarchies.

“Humpbacks, nightingales and hermit thrushes are solitary singers,” Kello said. “The barcodes show that their songs have similar layers of structure, but we don’t know what it means — yet.”

The idea for this project came from Kello’s sabbatical at the University of Montpellier in France, where he worked and discussed ideas with Dalla Bella. Balasubramaniam, who studies how music is perceived, is in the School of Social Sciences, Humanities and Arts with Kello, who studies speech and language processing. The project was a natural collaboration and is part of a growing research focus at UC Merced that was enabled by the National Science Foundation-funded CHASE summer school on Music and Language in 2014, and a Google Faculty Award to Kello.

Balasubramaniam is interested in continuing the work to better understand how brains distinguish between music and speech, while Kello said there are many different avenues to pursue.

For instance, the researchers found nearly identical temporal hierarchies for six different languages, which may suggest something universal about human speech. However, because this result was based on recordings of TED Talks — which have a common style and progression — Kello said it will be important to keep looking at other forms of speech and language.

One of his graduate students, Sara Schneider, is using the method to study the convergence of Spanish and English barcodes in bilingual conversations. Another graduate student, Adolfo Ramirez-Aristizabal, is working with Kello and Balasubramaniam to study whether the barcode method may shed light on how brains process speech and other complex sounds.

“Listening to music and speech, we can hear some of what we see in the barcodes, and the information may be useful for automatic classification of audio recordings. But that doesn’t mean that our brains process music and speech using these barcodes,” Kello said. “It’s intriguing, but we need to keep asking questions and go where the data lead us.”
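The news release describes the “barcodes” only in general terms (clusters of sound energy, and clusters of clusters), so the following is a rough sketch of the idea rather than the researchers' procedure. I am assuming a simple loudness threshold to turn a recording's amplitude envelope into on/off events, and I am using the Allan factor, a standard statistic for how clumped events are at a given timescale, as the measure of clustering. The function names and the two toy event trains are hypothetical.

```python
def envelope_to_barcode(envelope, threshold):
    """Mark each time bin 1 if the sound energy exceeds the threshold, else 0."""
    return [1 if level > threshold else 0 for level in envelope]


def allan_factor(events, window):
    """Clustering of events at one timescale (window length in time bins).

    Counts events in adjacent windows and compares neighbouring counts:
    0 means perfectly even spacing; larger values mean more clumping.
    """
    counts = [sum(events[i:i + window])
              for i in range(0, len(events) - window + 1, window)]
    if len(counts) < 2:
        raise ValueError("need at least two windows")
    mean_count = sum(counts) / len(counts)
    if mean_count == 0:
        return 0.0
    squared_diffs = [(b - a) ** 2 for a, b in zip(counts, counts[1:])]
    return (sum(squared_diffs) / len(squared_diffs)) / (2 * mean_count)


def allan_curve(events, windows):
    """Allan factor at several timescales, a crude view of 'clusters of clusters'."""
    return {w: allan_factor(events, w) for w in windows}


# Two toy barcodes with the same number of events (64) over 256 time bins.
evenly_spaced = [1, 0, 0, 0] * 64          # one event every fourth bin
clustered = ([1] * 8 + [0] * 24) * 8       # bursts of events separated by silence

print(allan_curve(evenly_spaced, [8, 16, 32]))
print(allan_curve(clustered, [8, 16, 32]))
```

Comparing how such a curve rises across timescales for different recordings is one way to put speech, music and animal song on a common footing; the paper itself is the place to check what the authors actually computed.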

Here’s a link to and a citation for the paper,

Hierarchical temporal structure in music, speech and animal vocalizations: jazz is like a conversation, humpbacks sing like hermit thrushes by Christopher T. Kello, Simone Dalla Bella, Butovens Médé, Ramesh Balasubramaniam. Journal of the Royal Society Interface DOI: 10.1098/rsif.2017.0231 Published 11 October 2017

This paper appears to be open access.*

*“This paper is behind a paywall” was changed to “… appears to be open access.” at 1700 hours on January 23, 2018.