DocumentCode :
1017637
Title :
A Query-by-Singing System for Retrieving Karaoke Music
Author :
Yu, Hung-Ming ; Tsai, Wei-Ho ; Wang, Hsin-Min
Author_Institution :
Inst. of Inf. Sci., Acad. Sinica, Taipei
Volume :
10
Issue :
8
fYear :
2008
Firstpage :
1626
Lastpage :
1637
Abstract :
This paper investigates the problem of retrieving karaoke music using query-by-singing techniques. Unlike regular CD music, where the stereo sound involves two audio channels that usually sound the same, karaoke music encompasses two distinct channels in each track: one is a mixture of the lead vocals and background accompaniment, and the other consists of accompaniment only. Although the two audio channels are distinct, the accompaniments in the two channels often resemble each other. We exploit this characteristic to: i) infer the background accompaniment for the lead vocals from the accompaniment-only channel, so that the main melody underlying the lead vocals can be extracted more effectively, and ii) detect phrase onsets based on the Bayesian information criterion (BIC) to predict the onset points of a song where a user's sung query may begin, so that the similarity between the melodies of the query and the song can be examined more efficiently. To further refine extraction of the main melody, we propose correcting potential errors in the estimated sung notes by exploiting a composition characteristic of popular songs whereby the sung notes within a verse or chorus section usually vary no more than two octaves. In addition, to facilitate an efficient and accurate search of a large music database, we employ multiple-pass dynamic time warping (DTW) combined with multiple-level data abstraction (MLDA) to compare the similarities of melodies. The results of experiments conducted on a karaoke database comprised of 1071 popular songs demonstrate the feasibility of query-by-singing retrieval for karaoke music.
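The melody comparison named in the abstract can be illustrated with a minimal sketch of plain, single-pass DTW over note sequences. This is not the paper's multiple-pass DTW or its MLDA scheme, neither of which is specified in this record. The function names `dtw_distance` and `rank_songs`, the use of MIDI note numbers, and the absolute-difference local cost are all assumptions made for illustration only.

```python
from math import inf


def dtw_distance(query, reference):
    """Plain DTW cost between two note sequences (assumed: MIDI note numbers)."""
    n, m = len(query), len(reference)
    # cost[i][j] = minimal cumulative cost of aligning query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: absolute pitch difference (an illustrative choice).
            d = abs(query[i - 1] - reference[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rank_songs(query, songs):
    """Return song titles ordered from most to least similar to the query.

    `songs` is a hypothetical dict mapping a title to its note sequence.
    """
    return sorted(songs, key=lambda title: dtw_distance(query, songs[title]))
```

Because DTW may align one note with several, a query sung more slowly than the reference can still match at zero cost. This sketch does not handle key transposition or restrict comparison to detected phrase onsets, both of which a practical query-by-singing system would need.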
Keywords :
Bayes methods; audio databases; data structures; music; query formulation; Bayesian information criterion; accompaniment-only channel; audio channel; background accompaniment; karaoke database; karaoke music; lead vocal; multiple-level data abstraction; multiple-pass dynamic time warping; music database; music retrieval; query-by-singing retrieval; query-by-singing system; Bayesian methods; Councils; Data mining; Databases; Error correction; Humans; Internet; Multiple signal classification; Music information retrieval; Signal processing; Bayesian information criterion; dynamic time warping; karaoke; music information retrieval; query-by-singing;
fLanguage :
English
Journal_Title :
Multimedia, IEEE Transactions on
Publisher :
IEEE
ISSN :
1520-9210
Type :
jour
DOI :
10.1109/TMM.2008.2007345
Filename :
4694852