• DocumentCode
    2414934
  • Title

    Automatic Song-Type Classification and Speaker Identification of Norwegian Ortolan Bunting (Emberiza Hortulana) Vocalizations

  • Author

    Trawicki, Marek B. ; Johnson, Michael T. ; Osiejuk, Tomasz S.

  • Author_Institution
    Dept. of Electrical & Computer Engineering, Marquette University, Milwaukee, WI
  • fYear
    2005
  • fDate
    28 Sept. 2005
  • Firstpage
    277
  • Lastpage
    282
  • Abstract
    This paper presents an approach to song-type classification and speaker identification of Norwegian Ortolan Bunting (Emberiza Hortulana) vocalizations using traditional human speech processing methods. Hidden Markov models (HMMs) are used for both tasks, with features including mel-frequency cepstral coefficients (MFCCs), log energy, and delta (velocity) and delta-delta (acceleration) coefficients. Vocalizations were tested using leave-one-out cross-validation. Classification accuracy for 5 song-types is 92.4%, dropping to 63.6% as the number and similarity of the songs increase. Song-type dependent speaker identification rates peak at 98.7%, with typical accuracies of 80-95% and a low end of 76.2% as the number of speakers increases. These experiments fit into a larger framework of research working towards methods for acoustic censusing of endangered species populations and more automated bioacoustic analysis methods.
  • Keywords
    acoustic signal processing; audio signal processing; bioacoustics; cepstral analysis; hidden Markov models; signal classification; speaker recognition; speech processing; Emberiza Hortulana vocalization; Norwegian Ortolan Bunting vocalization; acceleration coefficients; acoustic censusing; bioacoustic analysis; delta-delta coefficients; endangered species population; hidden Markov models; human speech processing; log energy; mel-frequency cepstral coefficients; song-type classification; speaker identification; velocity coefficients; Acceleration; Biomedical signal processing; Birds; Cepstral analysis; Environmental factors; Hidden Markov models; Humans; Loudspeakers; Speech processing; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning for Signal Processing, 2005 IEEE Workshop on
  • Conference_Location
    Mystic, CT
  • Print_ISBN
    0-7803-9517-4
  • Type

    conf

  • DOI
    10.1109/MLSP.2005.1532913
  • Filename
    1532913