DocumentCode :
2529001
Title :
Statistical Modeling and Retrieval of Polyphonic Music
Author :
Unal, Erdem ; Georgiou, Panayiotis G. ; Narayanan, Shrikanth S. ; Chew, Elaine
Author_Institution :
Southern California Univ., Los Angeles
fYear :
2007
fDate :
1-3 Oct. 2007
Firstpage :
405
Lastpage :
409
Abstract :
In this article, we propose a solution to the problem of query by example for polyphonic music audio. We first present a generic mid-level representation for audio queries. Unlike previous efforts in the literature, the proposed representation depends neither on the differing spectral characteristics of musical instruments nor on the accurate location of note onsets and offsets. This is achieved by first mapping the short-term frequency spectrum of consecutive audio frames to the musical space (the spiral array) and defining a tonal identity with respect to the center of effect that is generated by the spectral weights of the musical notes. We then use the resulting one-dimensional text representations of the audio to create n-gram statistical sequence models that track the tonal characteristics and behavior of the pieces. After performing appropriate smoothing, we build a collection of melodic n-gram models for testing. Using perplexity-based scoring, we test the likelihood of a sequence of lexical chords (an audio query) given each model in the database collection. Initial results show that some variation of the input piece appears in the top 5 results 81% of the time for whole-melody inputs within a database of 500 polyphonic melodies. We also tested the retrieval engine on short audio clips. Using 25 s segments, variations of the input piece are among the top 5 results 75% of the time.
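The retrieval step described in the abstract (one smoothed n-gram model per piece, with queries ranked by perplexity) can be illustrated with a minimal sketch. It assumes bigrams and add-one (Laplace) smoothing, neither of which the abstract specifies; the function names `train_bigram`, `perplexity`, and `rank` are hypothetical, and the chord labels stand in for the "lexical chords" that the paper derives from the spiral array. This is not the authors' implementation.

```python
import math
from collections import Counter


def train_bigram(tokens, vocab_size):
    """Build a bigram model for one piece (hypothetical helper).

    Returns context counts, bigram counts, and the vocabulary size
    needed for add-one smoothing (an assumed smoothing method).
    """
    context = Counter(tokens[:-1])
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))
    return context, bigrams, vocab_size


def perplexity(query, model):
    """Perplexity of a query sequence under one piece's bigram model."""
    if len(query) < 2:
        raise ValueError("query needs at least two tokens")
    context, bigrams, vocab_size = model
    log_prob = 0.0
    count = 0
    for prev, cur in zip(query[:-1], query[1:]):
        # Add-one smoothing keeps unseen transitions from having zero probability.
        p = (bigrams[(prev, cur)] + 1) / (context[prev] + vocab_size)
        log_prob += math.log(p)
        count += 1
    return math.exp(-log_prob / count)


def rank(query, models):
    """Return piece names ordered from lowest (best) to highest perplexity."""
    return sorted(models, key=lambda name: perplexity(query, models[name]))
```

A lower perplexity means the query is more predictable under that piece's model, so the first entries returned by `rank` play the role of the "top 5 results" mentioned in the abstract.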
Keywords :
audio signal processing; music; query processing; signal representation; smoothing methods; audio query representation; frequency spectrum; lexical chords; n-gram statistical sequence models; perplexity-based scoring; polyphonic music retrieval; smoothing; statistical modeling; Audio databases; Hidden Markov models; Instruments; Laboratories; Multiple signal classification; Music information retrieval; Signal processing algorithms; Spatial databases; Speech analysis; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia Signal Processing, 2007. MMSP 2007. IEEE 9th Workshop on
Conference_Location :
Crete
Print_ISBN :
978-1-4244-1274-7
Type :
conf
DOI :
10.1109/MMSP.2007.4412902
Filename :
4412902