Title :
Speech/Music Discrimination using Spectral Peak Feature for Speaker Indexing
Author :
Keum, Ji-Soo ; Lee, Hyon-Soo
Author_Institution :
Dept. of Comput. Eng., Kyung Hee Univ., Seoul
Abstract :
We present a new speech/music discrimination method based on a spectral peak feature and a threshold on spectral peak duration. The focus is on extracting a feature that reflects the duration characteristics of spectral peaks, with fast discrimination and high performance as design goals. We extract the spectral peak feature from each peak track of the audio spectrum and normalize it by the length of the segment. Speech and music can then be discriminated easily by applying the duration threshold to the extracted spectral peak duration feature. We evaluate our method on 26,773 seconds of audio data comprising speech (Korean, English, Chinese and Japanese) and various kinds of pop music (ballad, rock, etc.). The average accuracy is 96.21% for speech and 89.49% for music. The experimental results indicate that our feature vector is suitable for speech/music discrimination and is computationally efficient.
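The peak-track idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how peaks are picked, how tracks are formed, or what the threshold value is. The sketch assumes that a peak is a local maximum of a magnitude-spectrum frame, that a track is a run of consecutive frames with a peak in the same frequency bin, that the feature is the mean track length divided by the number of frames in the segment, and that longer-lived peaks indicate music. The function names, the `floor` parameter, and the `threshold` value of 0.5 are hypothetical.

```python
from typing import Dict, List


def peak_bins(frame: List[float], floor: float = 0.0) -> List[int]:
    """Indices of local maxima above `floor` in one magnitude-spectrum frame."""
    return [
        k for k in range(1, len(frame) - 1)
        if frame[k] > floor and frame[k] > frame[k - 1] and frame[k] >= frame[k + 1]
    ]


def track_durations(spectrogram: List[List[float]], floor: float = 0.0) -> List[int]:
    """Length in frames of every run of consecutive frames sharing a peak bin."""
    durations: List[int] = []
    active: Dict[int, int] = {}  # bin index -> length of the currently open track
    for frame in spectrogram:
        current = set(peak_bins(frame, floor))
        # Close tracks whose bin is no longer a peak in this frame.
        for k in list(active):
            if k not in current:
                durations.append(active.pop(k))
        # Extend surviving tracks and open new ones.
        for k in current:
            active[k] = active.get(k, 0) + 1
    durations.extend(active.values())  # tracks still open at the segment end
    return durations


def peak_duration_feature(spectrogram: List[List[float]], floor: float = 0.0) -> float:
    """Mean peak-track duration normalized by segment length (assumed definition)."""
    n_frames = len(spectrogram)
    durations = track_durations(spectrogram, floor)
    if n_frames == 0 or not durations:
        return 0.0
    return sum(durations) / len(durations) / n_frames


def classify(spectrogram: List[List[float]], threshold: float = 0.5,
             floor: float = 0.0) -> str:
    """Label a segment by thresholding the feature (threshold is hypothetical)."""
    return "music" if peak_duration_feature(spectrogram, floor) > threshold else "speech"


# Toy segments: a sustained tone versus a peak that changes bin every frame.
steady = [[0.0, 1.0, 0.0, 0.0, 0.0] for _ in range(10)]
moving = [[1.0 if k == 1 + (i % 3) else 0.0 for k in range(5)] for i in range(10)]
print(classify(steady), classify(moving))
```

On these toy inputs the sustained tone forms a single track spanning the whole segment, while the moving peak forms only one-frame tracks, so the two segments fall on opposite sides of the assumed threshold. Real audio would first need a short-time spectrum, which is left out here.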
Keywords :
feature extraction; music; speech processing; Chinese; English; Japanese; Korean; audio spectrum; duration threshold; feature extraction; speaker indexing; spectral peak feature; speech-music discrimination; Data mining; Feature extraction; Indexing; Loudspeakers; Mel frequency cepstral coefficient; Multiple signal classification; Music; Spectral analysis; Speech analysis; Speech processing;
Conference_Title :
Intelligent Signal Processing and Communications, 2006. ISPACS '06. International Symposium on
Conference_Location :
Yonago
Print_ISBN :
0-7803-9732-0
Electronic_ISBN :
0-7803-9733-9
DOI :
10.1109/ISPACS.2006.364897