DocumentCode
2329810
Title
Analysis on speech characteristics for robust voice activity detection
Author
Espi, Miquel ; Miyabe, Shigeki ; Nishimoto, Takuya ; Ono, Nobutaka ; Sagayama, Shigeki
Author_Institution
Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
fYear
2010
fDate
12-15 Dec. 2010
Firstpage
151
Lastpage
156
Abstract
This paper discusses effective speech characterization for off-line voice activity detection (VAD), an important step prior to speech data mining. Five characteristics of speech are examined: energy, spectral shape, periodicity, phonetic variation, and spectral fluctuation, the last observed from a new point of view. Specific spectral fluctuation patterns of speech are analyzed using a multi-stage Harmonic/Percussive Sound Separation algorithm. We compare the performance of the individual features and of various combinations to evaluate their robustness in multiple noise environments. The combined approach outperformed the baseline of the CENSREC-1-C evaluation framework. The results suggest that the proposed feature extraction approach can improve state-of-the-art VAD methods.
Keywords
data mining; speech recognition; CENSREC 1-C evaluation framework; feature extraction; multistage harmonic-percussive sound separation algorithm; robust voice activity detection; speech characteristics analysis; speech data mining; Delta cepstrum; Harmonic-Percussive Sound Separation; Speech characterization; Voice Activity Detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location
Berkeley, CA
Print_ISBN
978-1-4244-7904-7
Electronic_ISBN
978-1-4244-7902-3
Type
conf
DOI
10.1109/SLT.2010.5700838
Filename
5700838