DocumentCode :
1756754
Title :
A Bag of Systems Representation for Music Auto-Tagging
Author :
Ellis, K. ; Coviello, Emanuele ; Chan, Antoni B. ; Lanckriet, Gert
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of California at San Diego, La Jolla, CA, USA
Volume :
21
Issue :
12
fYear :
2013
fDate :
Dec. 2013
Firstpage :
2554
Lastpage :
2569
Abstract :
We present a content-based automatic tagging system for music that relies on a high-level, concise “Bag of Systems” (BoS) representation of the characteristics of a musical piece. The BoS representation leverages a rich dictionary of musical codewords, where each codeword is a generative model that captures timbral and temporal characteristics of music. Songs are represented as a BoS histogram over codewords, which allows for the use of traditional algorithms for text document retrieval to perform auto-tagging. Compared to estimating a single generative model to directly capture the musical characteristics of songs associated with a tag, the BoS approach offers the flexibility to combine different generative models at various time resolutions through the selection of the BoS codewords. Additionally, decoupling the modeling of audio characteristics from the modeling of tag-specific patterns makes BoS a more robust and rich representation of music. Experiments show that this leads to superior auto-tagging performance.
Keywords :
content-based retrieval; music; BoS codewords; BoS histogram; BoS representation; audio characteristics; bag of systems representation; content-based automatic tagging system; generative model; music auto-tagging; musical characteristics; musical codewords; musical piece; tag-specific patterns; temporal characteristics; text document retrieval; timbral characteristics; time resolutions; Computational modeling; Data models; Feature extraction; Hidden Markov models; Histograms; Music; Tagging; Audio annotation and retrieval; bag of systems; content-based music processing; dynamic texture model; music information retrieval;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2013.2279318
Filename :
6583960