Title :
Singing voice detection in pop songs using co-training algorithm
Author :
Khine, Swe Zin Kalayar ; Nwe, Tin Lay ; Li, Haizhou
Author_Institution :
Inst. for Infocomm Res., Singapore
Date :
March 31 to April 4, 2008
Abstract :
We propose a co-training algorithm to detect singing voice segments in pop songs. The co-training algorithm leverages compatible and partially uncorrelated information across different features to effectively boost the model using unlabeled data. We adopt this technique to take advantage of abundant unlabeled songs and explore the use of different acoustic features, including vibrato, harmonic, attack-decay, and MFCC (mel-frequency cepstral coefficient) features. The proposed algorithm substantially reduces the amount of manual labeling work and the computational cost. Experiments are conducted on a database of 94 solo pop songs. We achieve an average error rate of 17% in segment-level singing voice detection.
Keywords :
hidden Markov models; speech recognition; co-training algorithm; mel frequency cepstral coefficients; pop songs; segment level singing voice detection; unlabeled songs; Acoustic signal detection; Feature extraction; Instruments; Labeling; Music; Spatial databases; Timbre; Web pages; Singing voice detection
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2008.4517938