Katie Wolf - DREU Experience
University of Minnesota -- Oregon State University
- Distributed Research Experience for Undergraduates - Summer 2009 -

The Project - Basics

On this page, I go over some of the basic tools and terminology used on the project.

Segmentation of Bird Sound

Bird sounds can be divided into phrases, syllables, and elements. Below is the spectrogram of the song of a Black-throated Blue Warbler (depicted above), split up into phrases, each of which contains a series of syllables. Syllables are built from elements, but in the simple case, which is the one our research deals with, elements and syllables are the same. When a wav file is loaded, a spectrogram (like below) is created. A spectrogram is a graph of the frequency (y-axis) vs. time (x-axis) of an audio file. In the spectrograms we are creating, the frequency runs from high to low as we move up the y-axis, so the higher frequencies are at the bottom of the image and the lower ones are toward the top.
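To make the idea of a spectrogram concrete, here is a minimal sketch of how one can be computed from raw audio samples. It is not the code used on the project: the function name `spectrogram`, the frame size, and the hop size are all illustrative assumptions, and a synthetic 1000 Hz tone stands in for a real wav file (real samples could be read with Python's standard `wave` module).

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Short-time Fourier magnitudes: one list of bin magnitudes per time frame."""
    # A Hann window tapers each frame to reduce leakage between frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT, keeping only the non-negative frequencies.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            total = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


# Synthetic stand-in for a recording: a pure 1000 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
samples = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(samples, frame_size=FRAME_SIZE, hop=32)

# Each bin spans SAMPLE_RATE / FRAME_SIZE = 125 Hz; find the loudest bin in frame 0.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames; loudest frequency in frame 0:", peak_hz, "Hz")
```

Each inner list is one column of the picture (a moment in time), and each position in that list is one row (a frequency band). Drawing the magnitudes as pixel brightness gives the kind of image shown on this page. The naive DFT is slow and is used here only to keep the sketch dependency-free; a real tool would use an FFT library.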

Here is the short recording from the Cornell Macaulay Library that corresponds to the spectrogram above: Black-throated Blue Warbler


The goal of my program is to let a user draw rectangles on the spectrogram above and label each one with a species. This will allow the ecologists working on the project to label data that will be used as a training set for the machine learning algorithms. The diagram below is an annotated spectrogram.
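In code, one labeled rectangle could be stored as a span of time, a span of frequency, and a species name. The sketch below is only an illustration of that idea: the `Annotation` class, its field names, the `labels_at` helper, and the example numbers are all hypothetical and are not taken from the actual program or from real recordings.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One user-drawn rectangle on the spectrogram, labeled with a species."""
    species: str
    t_start: float  # seconds
    t_end: float    # seconds
    f_low: float    # Hz
    f_high: float   # Hz

    def contains(self, t, f):
        """True if the time/frequency point (t, f) lies inside the rectangle."""
        return self.t_start <= t <= self.t_end and self.f_low <= f <= self.f_high


def labels_at(annotations, t, f):
    """Species labels of every rectangle covering the point (t, f)."""
    return [a.species for a in annotations if a.contains(t, f)]


# Made-up rectangles, for illustration only.
annotations = [
    Annotation("Black-throated Blue Warbler", 0.5, 1.5, 4000.0, 6000.0),
    Annotation("Hermit Thrush", 2.0, 3.0, 2000.0, 5000.0),
]

print(labels_at(annotations, 1.0, 5000.0))  # point inside the first rectangle
print(labels_at(annotations, 1.8, 5000.0))  # point between the two rectangles
```

A list of such records is the sort of thing a training set for the machine learning algorithms could be built from: every spectrogram pixel inside a rectangle gets that rectangle's species label.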


The data from the Macaulay Library was collected by people in the field using special microphones aimed at a specific bird. As can be seen from the week 1 photos of my weekly log, the microphones used in our research sit in the field and pick up all the surrounding noise. Because of this, there are some issues with the quality of our sound and the amount of background noise that gets picked up.

Here is a short recording of a Hermit Thrush (pictured above) from one of the microphones at the Research Forest and the corresponding spectrogram: Hermit Thrush

The background noise in the recording accounts for a lot of the salt and pepper on the spectrogram. One of the next steps in the research is to try to minimize the background noise so that the birds are easier to distinguish. The black bar at the top of the spectrogram above is one way that we have minimized some of the noise. By putting in a low bandwidth (a cutoff below which frequencies are left out) and by normalizing, we have decreased some of the noise from the stream by getting rid of a lot of the louder, lower sounds. This allows the bird noise to come out more clearly. Below is a spectrogram with no low bandwidth.
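The low bandwidth and normalization steps can be sketched roughly as follows. This is an assumption about how such a step might work, not the project's actual code: the function name `suppress_low_band`, the choice to zero out bins below the cutoff, and the choice to normalize by dividing by the loudest remaining value are all illustrative, and the small example spectrogram is made up.

```python
import math


def suppress_low_band(spec, bin_hz, cutoff_hz):
    """Zero every frequency bin below cutoff_hz, then scale so the peak is 1.0.

    spec is a list of time frames; each frame lists magnitudes from the
    lowest frequency bin to the highest, with bins bin_hz apart.
    """
    first_kept = math.ceil(cutoff_hz / bin_hz)
    filtered = [[0.0 if k < first_kept else m for k, m in enumerate(frame)]
                for frame in spec]
    # Normalize by the loudest remaining value (assumed meaning of "normalizing").
    peak = max((m for frame in filtered for m in frame), default=0.0)
    if peak == 0.0:
        return filtered
    return [[m / peak for m in frame] for frame in filtered]


# Made-up spectrogram: 2 time frames, 4 bins of 1000 Hz each.
# Bin 0 holds loud low-frequency noise (for example, a stream).
noisy = [
    [9.0, 1.0, 2.0, 0.5],
    [8.0, 0.5, 4.0, 1.0],
]

cleaned = suppress_low_band(noisy, bin_hz=1000.0, cutoff_hz=1000.0)
for frame in cleaned:
    print(frame)
```

Because the loud low bin no longer sets the scale, the quieter values in the higher bins take up more of the brightness range after normalizing, which is why the bird sounds stand out more. The zeroed bins are what show up as the black bar in the image.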