Title :
Analysis on speech characteristics for robust voice activity detection
Author :
Espi, Miquel ; Miyabe, Shigeki ; Nishimoto, Takuya ; Ono, Nobutaka ; Sagayama, Shigeki
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
Abstract :
This paper discusses effective speech characterization for off-line voice activity detection (VAD), which is an important step prior to speech data mining. Five different characteristics of speech are examined: energy, spectral shape, periodicity, phonetic variation, and spectral fluctuation, the last of which is observed from a new point of view. Specific spectral fluctuation patterns of speech were analyzed using a multi-stage Harmonic/Percussive Sound Separation algorithm. We compared the performance of the individual features and of various combinations to evaluate their robustness in multiple noise environments. The combined approach outperformed the baseline of the CENSREC-1-C evaluation framework. The results suggest that the proposed feature extraction approach can improve state-of-the-art VAD methods.
Keywords :
data mining; speech recognition; CENSREC-1-C evaluation framework; feature extraction; multistage harmonic-percussive sound separation algorithm; robust voice activity detection; speech characteristics analysis; speech data mining; Delta cepstrum; Harmonic-Percussive Sound Separation; Speech characterization; Voice Activity Detection
Conference_Title :
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-7904-7
Electronic_ISBN :
978-1-4244-7902-3
DOI :
10.1109/SLT.2010.5700838