Title :
Discrimination of speech and monophonic singing in continuous audio streams applying multi-layer support vector machines
Author :
Schuller, Björn ; Rigoll, Gerhard ; Lang, Manfred
Author_Institution :
Inst. for Human-Machine Commun., Technische Univ. München, Germany
Abstract :
We present a novel approach to discriminating speech from monophonic singing for use in music information retrieval applications. A working prototype is introduced that applies multi-layer support vector machines to static high-level features derived from the pitch and energy contours of an acoustic signal. The feature set used for discrimination is presented and ranked by linear discriminant analysis. For automatic segmentation of an input signal stream, a further feature set is used to discriminate signal from noise. A corpus for training and evaluation, comprising speech and monophonic singing data from nine performers, is described in detail. The data were labeled according to the judgment of a separate group of test subjects. A recognition rate of 99.2% correct assignments was reached, demonstrating the high performance of the proposed methods.
Keywords :
acoustic noise; audio signal processing; natural language interfaces; random noise; speech processing; speech recognition; support vector machines; acoustic signal; content-based retrieval; continuous audio streams; energy contours; linear discriminant analysis; monophonic singing; multilayer support vector machines; music information retrieval; natural speech utterance; pitch contours; speech discrimination; speech-singing discrimination; static high-level features; Databases; Labeling; Man machine systems; Prototypes; Speech analysis; Streaming media
Conference_Title :
2004 IEEE International Conference on Multimedia and Expo (ICME '04)
Print_ISBN :
0-7803-8603-5
DOI :
10.1109/ICME.2004.1394569
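Illustration :
The abstract mentions static features derived from pitch and energy contours and a ranking of those features by linear discriminant analysis, but the record does not list the features themselves. The following is only a minimal sketch under stated assumptions: the four statistics, the function names, and the toy pitch contours are all hypothetical, and the per-feature two-class Fisher criterion is used here as a simplified stand-in for an LDA-based ranking, not as the authors' procedure. The support vector machine stage is not shown.

```python
from statistics import mean, pstdev, pvariance


def static_features(contour):
    """Collapse a frame-wise contour (needs >= 2 frames) into static statistics.

    The choice of statistics is an assumption made for illustration only.
    """
    deltas = [b - a for a, b in zip(contour, contour[1:])]
    return {
        "mean": mean(contour),
        "std": pstdev(contour),
        "range": max(contour) - min(contour),
        "mean_abs_delta": mean(abs(d) for d in deltas),
    }


def fisher_ratio(values_a, values_b):
    """Two-class Fisher criterion for one feature:
    squared distance of class means over the sum of class variances."""
    num = (mean(values_a) - mean(values_b)) ** 2
    den = pvariance(values_a) + pvariance(values_b)
    return num / den if den > 0 else float("inf")


def rank_features(samples_a, samples_b):
    """Return (feature name, Fisher ratio) pairs, most discriminative first."""
    names = samples_a[0].keys()
    scores = {
        name: fisher_ratio(
            [s[name] for s in samples_a],
            [s[name] for s in samples_b],
        )
        for name in names
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic pitch contours in Hz (invented toy data, not from the paper's corpus).
# "Speech-like": continuously gliding pitch. "Singing-like": held notes with a step.
speech_contours = [
    [120, 130, 145, 150, 140, 125, 110, 100],
    [200, 190, 175, 180, 195, 210, 205, 185],
]
singing_contours = [
    [220, 220, 220, 220, 247, 247, 247, 247],
    [262, 262, 262, 262, 294, 294, 294, 294],
]

speech_feats = [static_features(c) for c in speech_contours]
singing_feats = [static_features(c) for c in singing_contours]
ranking = rank_features(speech_feats, singing_feats)

for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

On this toy data the frame-to-frame pitch change separates the two classes far better than the other statistics, which matches the intuition that sung notes are held while spoken pitch glides. With real recordings the ranking could look quite different.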