• DocumentCode
    2329810
  • Title

    Analysis on speech characteristics for robust voice activity detection

  • Author

    Espi, Miquel ; Miyabe, Shigeki ; Nishimoto, Takuya ; Ono, Nobutaka ; Sagayama, Shigeki

  • Author_Institution
    Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
  • fYear
    2010
  • fDate
    12-15 Dec. 2010
  • Firstpage
    151
  • Lastpage
    156
  • Abstract
    This paper discusses effective speech characterization for off-line voice activity detection (VAD), which is an important step prior to speech data mining. Five different characteristics of speech are examined: energy, spectral shape, periodicity, phonetic variation, and spectral fluctuation, the last observed from a new point of view. Specific spectral fluctuation patterns of speech were analyzed using a multi-stage Harmonic/Percussive Sound Separation algorithm. We compared the performance of the individual features, and of various combinations of them, to evaluate their robustness in multiple noise environments. The combined approach outperformed the baseline of the CENSREC-1-C evaluation framework. The results suggest that the proposed feature extraction approach can improve state-of-the-art VAD methods.
  • Keywords
    data mining; speech recognition; CENSREC 1-C evaluation framework; feature extraction; multistage harmonic-percussive sound separation algorithm; robust voice activity detection; speech characteristics analysis; speech data mining; Delta cepstrum; Harmonic-Percussive Sound Separation; Speech characterization; Voice Activity Detection;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2010 IEEE
  • Conference_Location
    Berkeley, CA
  • Print_ISBN
    978-1-4244-7904-7
  • Electronic_ISBN
    978-1-4244-7902-3
  • Type

    conf

  • DOI
    10.1109/SLT.2010.5700838
  • Filename
    5700838