DocumentCode :
3518302
Title :
On the importance of modeling temporal information in music tag annotation
Author :
Reed, Jeremy ; Lee, Chin-Hui
Author_Institution :
Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
1873
Lastpage :
1876
Abstract :
Music is an art form in which sounds are organized in time; however, current approaches for determining similarity and classification largely ignore temporal information. This paper presents an approach to automatic tagging which incorporates temporal aspects of music directly into the statistical models, unlike the typical bag-of-frames paradigm in traditional music information retrieval techniques. Vector quantization on song segments leads to a vocabulary of acoustic segment models. An unsupervised, iterative process that cycles between Viterbi decoding and Baum-Welch estimation builds transcripts of this vocabulary. Latent semantic analysis converts the song transcriptions into a vector for subsequent classification using a support vector machine for each tag. Experimental results demonstrate that the proposed approach performs better in 15 of the 18 tags. Further analysis demonstrates an ability to capture local timbral characteristics as well as sequential arrangements of acoustic segment models.
Keywords :
acoustic signal processing; hidden Markov models; information retrieval; iterative methods; music; signal classification; speech processing; statistical analysis; support vector machines; vector quantisation; Baum-Welch estimation; Viterbi decoding; acoustic segment model; bag-of-frames paradigm; iterative process; latent semantic analysis; local timbral characteristic; music information retrieval technique; music tag annotation; song segments; statistical model; support vector machine; vector quantization; Automatic speech recognition; Multiple signal classification; Music information retrieval; Support vector machine classification; Tagging; Technical Activities Guide -TAG; Vocabulary
fLanguage :
English
Publisher :
IEEE
Conference_Title :
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009)
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4959973
Filename :
4959973