DocumentCode :
77465
Title :
Continuous Birdsong Recognition Using Gaussian Mixture Modeling of Image Shape Features
Author :
Chang-Hsing Lee ; Sheng-Bin Hsu ; Jau-Ling Shih ; Chih-Hsun Chou
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Chung Hua Univ., Hsinchu, Taiwan
Volume :
15
Issue :
2
fYear :
2013
fDate :
Feb. 2013
Firstpage :
454
Lastpage :
464
Abstract :
Traditional birdsong recognition approaches used acoustic features based on the acoustic model of speech production or the perceptual model of the human auditory system to identify the associated bird species. In this paper, a new feature descriptor that uses image shape features is proposed to identify bird species based on the recognition of fixed-duration birdsong segments whose spectrograms are viewed as gray-level images. The MPEG-7 angular radial transform (ART) descriptor, which can compactly and efficiently describe the gray-level variations within an image region in both angular and radial directions, is employed to extract the shape features from the spectrogram image. To effectively capture both frequency and temporal variations within a birdsong segment using ART, a sector expansion algorithm is proposed to transform its spectrogram image into a corresponding sector image such that the frequency and temporal axes of the spectrogram image align with the radial and angular directions of the ART basis functions, respectively. For the classification of 28 bird species using Gaussian mixture models (GMM), the best classification accuracies using the proposed ART descriptor are 86.30% and 94.62% for 3-second and 5-second birdsong segments, respectively, which are better than those of traditional descriptors such as LPCC, MFCC, and TDMFCC.
Keywords :
Gaussian processes; biology computing; feature extraction; pattern recognition; signal classification; Gaussian mixture modeling; LPCC descriptor; MFCC descriptor; MPEG-7 ART descriptor; TDMFCC descriptor; acoustic feature; angular direction; angular radial transform; associated bird species; classification accuracy; continuous birdsong recognition; feature descriptor; feature extraction; fixed-duration birdsong segment; frequency variation; gray-level image; human auditory system; image shape feature; radial direction; sector expansion algorithm; spectrogram image; speech production; temporal variation; Acoustics; Birds; Feature extraction; Image recognition; Spectrogram; Subspace constraints; Vectors; Angular radial transform (ART); Gaussian mixture models (GMM); birdsong recognition;
fLanguage :
English
Journal_Title :
Multimedia, IEEE Transactions on
Publisher :
IEEE
ISSN :
1520-9210
Type :
jour
DOI :
10.1109/TMM.2012.2229969
Filename :
6362230