DocumentCode :
3488183
Title :
A fusion study in speech/music classification
Author :
Pinquier, Julien ; Rouas, Jean-Luc ; André-Obrecht, Régine
Author_Institution :
Inst. de Recherche en Informatique de Toulouse, CNRS, Toulouse, France
Volume :
2
fYear :
2003
fDate :
6-10 April 2003
Abstract :
We present and merge two speech/music classification approaches that we have developed. The first is a differentiated modeling approach based on spectral analysis, implemented using Gaussian mixture models (GMM). The second is based on three original features: entropy modulation, stationary segment duration, and number of segments. These features are merged with the classical 4 Hz modulation energy. Our classification system is a fusion of the two approaches. It is divided into two classification tasks (speech/non-speech and music/non-music) and achieves 94% accuracy for speech detection and 90% for music detection with one second of input signal. Besides the spectral information and GMM classically used in speech/music discrimination, simple parameters bring complementary and efficient information.
Keywords :
Gaussian processes; audio signal processing; entropy; feature extraction; music; pattern classification; spectral analysis; speech processing; speech recognition; GMM; Gaussian mixture model; acoustic component extraction; differentiated modeling approach; entropy modulation; modulation energy; music detection; music features; spectral analysis; speech detection; speech features; speech/music classification; stationary segment duration; Acoustic signal detection; Cepstral analysis; Data mining; Entropy; Indexing; Indium phosphide; Multiple signal classification; Music; Spectral analysis; Speech;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1202283
Filename :
1202283