DocumentCode :
249363
Title :
Speech and Singing Discrimination for Audio Data Indexing
Author :
Wei-Ho Tsai ; Cin-Hao Ma
Author_Institution :
Dept. of Electron. Eng., Nat. Taipei Univ. of Technol., Taipei, Taiwan
fYear :
2014
fDate :
June 27 2014-July 2 2014
Firstpage :
276
Lastpage :
280
Abstract :
This study investigates techniques for automatically discriminating speech from singing voices for audio data indexing. We propose a discrimination system based on both timbre and pitch feature analyses. For the timbre features, voice recordings are converted into Mel-Frequency Cepstral Coefficients and their first derivatives, which are then analyzed using Gaussian mixture models. For the pitch features, we represent voice recordings as MIDI note sequences and then use bigram models to analyze the dynamic change information of the notes. Our experiments, conducted on a database that includes 600 test recordings from 10 subjects, show that the proposed system achieves 94.3% accuracy.
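The pitch branch described in the abstract (MIDI note sequences scored by bigram models) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `NoteBigramModel`, the add-one smoothing, the per-transition normalization, and the toy note sequences are all assumptions made for illustration, and the timbre branch (MFCCs with Gaussian mixture models) is left out.

```python
import math
from collections import Counter


class NoteBigramModel:
    """Bigram model over MIDI note numbers (0-127) with add-one smoothing.

    Hypothetical helper for illustration; the smoothing scheme is an assumption.
    """

    def __init__(self, vocab_size=128):
        self.vocab_size = vocab_size
        self.pair_counts = Counter()  # counts of (previous note, current note)
        self.prev_counts = Counter()  # counts of each note used as "previous"

    def fit(self, sequences):
        for seq in sequences:
            for prev, cur in zip(seq, seq[1:]):
                self.pair_counts[(prev, cur)] += 1
                self.prev_counts[prev] += 1
        return self

    def log_likelihood(self, seq):
        """Average log-probability per note transition."""
        total = 0.0
        for prev, cur in zip(seq, seq[1:]):
            num = self.pair_counts.get((prev, cur), 0) + 1
            den = self.prev_counts.get(prev, 0) + self.vocab_size
            total += math.log(num / den)
        return total / max(len(seq) - 1, 1)


def classify(seq, speech_model, singing_model):
    """Pick the class whose bigram model scores the note sequence higher."""
    if singing_model.log_likelihood(seq) > speech_model.log_likelihood(seq):
        return "singing"
    return "speech"


# Toy, made-up training data: sustained notes for singing, wandering pitch for speech.
singing_train = [[60, 60, 60, 62, 62, 62, 64, 64, 64],
                 [67, 67, 67, 65, 65, 65, 64, 64, 64]]
speech_train = [[50, 52, 55, 53, 49, 51, 54, 50, 48],
                [57, 55, 52, 54, 58, 56, 53, 55, 51]]

singing_model = NoteBigramModel().fit(singing_train)
speech_model = NoteBigramModel().fit(speech_train)

print(classify([60, 60, 62, 62, 64, 64], speech_model, singing_model))
print(classify([50, 52, 55, 53, 49], speech_model, singing_model))
```

In a full system, a score like this would presumably be combined with the timbre-based score before the final decision; how the paper combines the two is not stated in the abstract.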
Keywords :
Gaussian processes; audio signal processing; cepstral analysis; feature extraction; indexing; mixture models; speech recognition; Gaussian mixture model; MIDI note sequences; audio data indexing; bigram model; dynamic change information; mel-frequency cepstral coefficients; pitch feature analyses; singing discrimination; singing voices; speech discrimination; timbre feature analyses; voice recordings; Accuracy; Feature extraction; Speech; Speech processing; Timbre; pitch; singing; speech; timbre; voice discrimination
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data (BigData Congress), 2014 IEEE International Congress on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4799-5056-0
Type :
conf
DOI :
10.1109/BigData.Congress.2014.138
Filename :
6906790