HMM-based modelling of individual syllables for bird species recognition from audio field recordings

Author

Jancovic, Peter ; Zakeri, Masoud ; Kokuer, Munevver ; Russell, Martin

Author_Institution

Sch. of Electron., Univ. of Birmingham, Birmingham, UK

fYear

2015

fDate

19-24 April 2015

Firstpage

768

Lastpage

772

Abstract

This paper presents an automatic system for recognition of bird species from audio field recordings. The acoustic signal is first segmented into isolated time-frequency segments, each corresponding to an individual detected sinusoidal component. Each segment is represented by a temporal sequence of the frequency values of the detected sinusoid, referred to as a frequency track. Hidden Markov models (HMMs) are employed to model the temporal evolution of the frequency track features. Individual syllables of bird vocalisations are discovered using an unsupervised method based on dynamic time warping and agglomerative hierarchical clustering. The resulting clusters are then used to create individual HMMs for the syllables of each species. Experiments are performed on over 33 hours of field recordings containing 30 bird species. Evaluations demonstrate that individual syllable HMMs reduce the error rate by over 40% compared with a single HMM of the same complexity for each bird species. The syllable HMM-based system recognises bird species with an accuracy of over 95% using 3 seconds of detected signal.
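
The unsupervised syllable discovery step described in the abstract (dynamic time warping followed by agglomerative hierarchical clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the local distance, the length normalisation, the linkage rule, or the stopping criterion, so the absolute frequency difference, the division by the summed track lengths, the average linkage, and the `threshold` parameter used here are all assumptions. The function names `dtw_distance` and `agglomerative_clusters` are hypothetical.

```python
from itertools import combinations


def dtw_distance(a, b):
    """DTW cost between two frequency tracks (sequences of frequency values)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance (assumed)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Length normalisation is an assumption, not taken from the paper.
    return cost[n][m] / (n + m)


def agglomerative_clusters(tracks, threshold):
    """Merge the closest clusters (average linkage, assumed) until the
    smallest inter-cluster distance exceeds `threshold`."""
    dist = {
        (i, j): dtw_distance(tracks[i], tracks[j])
        for i, j in combinations(range(len(tracks)), 2)
    }
    clusters = [[i] for i in range(len(tracks))]

    def linkage(c1, c2):
        pairs = [dist[(min(i, j), max(i, j))] for i in c1 for j in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        (x, y), best = min(
            (
                ((x, y), linkage(clusters[x], clusters[y]))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        # x < y is guaranteed by combinations, so deleting y keeps x valid.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Toy frequency tracks in Hz (made-up values, for illustration only).
tracks = [
    [1000, 1100, 1200],
    [1000, 1050, 1100, 1200],
    [3000, 2900, 2800],
    [3000, 2950, 2800],
]
print(agglomerative_clusters(tracks, threshold=100.0))  # [[0, 1], [2, 3]]
```

In this toy example each resulting cluster plays the role of one discovered syllable class; in a system like the one described, the tracks in each cluster would then be used to train a separate HMM for that syllable.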

Keywords

acoustic signal detection; audio recording; hidden Markov models; time-frequency analysis; HMM; acoustic signal segmentation; agglomerative hierarchical clustering; audio field recordings; bird species recognition; bird vocalisations; dynamic time warping; error rate reduction; frequency track features; individual detected sinusoidal component; individual syllables; isolated time-frequency segments; temporal sequence; unsupervised method; Accuracy; Acoustics; Biological system modeling; Birds; Feature extraction; Speech; DTW; frequency track; segmentation; sinusoid detection; syllable; unsupervised clustering

fLanguage

English

Publisher

IEEE

Conference_Title

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7178073

Filename

7178073