DocumentCode
2575162
Title
Discrimination of speech and monophonic singing in continuous audio streams applying multi-layer support vector machines
Author
Schuller, Björn ; Rigoll, Gerhard ; Lang, Manfred
Author_Institution
Inst. for Human-Machine Commun., Technische Univ. München, Germany
Volume
3
fYear
2004
fDate
27-30 June 2004
Firstpage
1655
Abstract
We present a novel approach to discriminating speech from monophonic singing for use in music information retrieval applications. A working prototype is introduced that applies multi-layer support vector machines to static high-level features derived from the pitch and energy contours of an acoustic signal. The feature set for discrimination is presented and ranked according to a linear discriminant analysis. For automatic segmentation of an input signal stream, a further feature set is used to discriminate signal from noise. A corpus for training and evaluation, comprising speech and monophonic singing data from nine performers, is described in detail. The data were labeled according to the judgment of a separate set of test subjects. A recognition rate of 99.2% correct assignments was reached, demonstrating the high performance of the proposed methods.
Keywords
acoustic noise; audio signal processing; natural language interfaces; random noise; speech processing; speech recognition; support vector machines; acoustic signal; content-based retrieval; continuous audio streams; energy contours; linear discriminant analysis; monophonic singing; multilayer support vector machines; music information retrieval; natural speech utterance; pitch contours; speech discrimination; speech-singing discrimination; static high-level features; Acoustic noise; Databases; Labeling; Linear discriminant analysis; Man machine systems; Music information retrieval; Prototypes; Speech analysis; Streaming media; Support vector machines;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia and Expo, 2004. ICME '04. 2004 IEEE International Conference on
Print_ISBN
0-7803-8603-5
Type
conf
DOI
10.1109/ICME.2004.1394569
Filename
1394569