DocumentCode
394703
Title
Multidimensional humming transcription using a statistical approach for query by humming systems
Author
Shih, Hsuan-Huei ; Narayanan, Shrikanth S. ; Kuo, C.-C. Jay
Author_Institution
Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA
Volume
5
fYear
2003
fDate
6-10 April 2003
Abstract
A new statistical pattern recognition approach to transcribing human humming is proposed. A musical note has two important attributes: pitch and duration. The proposed algorithm generates multidimensional humming transcriptions that contain both pitch and duration information. Query by humming provides a natural means of content-based retrieval from music databases, and this research provides a robust front end for such an application. Each note segment in the humming waveform is modeled by a hidden Markov model (HMM), while the pitch of the note is modeled by a Gaussian mixture model. Preliminary real-time recognition experiments are carried out with models trained on data obtained from eight human subjects, and an overall correct recognition rate of around 80% is demonstrated.
Keywords
Gaussian processes; audio databases; content-based retrieval; hidden Markov models; multidimensional signal processing; music; pattern recognition; statistical analysis; Gaussian mixture model; HMM; duration; hidden Markov model; multidimensional humming transcription; music databases; pitch; statistical pattern recognition; Humans; Indexing; Multidimensional systems; Multimedia databases; Music information retrieval; Robustness; Signal processing algorithms
fLanguage
English
Publisher
IEEE
Conference_Titel
2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Proceedings
ISSN
1520-6149
Print_ISBN
0-7803-7663-3
Type
conf
DOI
10.1109/ICASSP.2003.1200026
Filename
1200026