• DocumentCode
    2479229
  • Title
    Integrating Articulatory based Features with Auditory Based Features for Robust Stressed Speech Recognition
  • Author
    Nwe, Tin Lay ; Li, Haizhou ; Wang, Ye
  • Author_Institution
    Inst. for Infocomm Res.
  • fYear
    0
  • fDate
    0-0 0
  • Firstpage
    1334
  • Lastpage
    1338
  • Abstract
    Intra-speaker variations due to perceptually induced stress or emotion adversely affect speech recognition system performance. In this paper, we combine auditory-based features (Mel-frequency cepstral coefficients and linear predictive cepstral coefficients) with articulatory-based (voicedness) features for robust speech recognition. Voicedness features are derived using linear and Teager energy operator (TEO) based nonlinear fast Fourier transform (FFT) spectra. Nonlinear properties are analyzed in both the time and frequency domains. In addition, we investigate the sensitivity of all these FFT spectra to stress and observe the performance of individual FFT spectra. The system is tested using stressed speech data from the Speech Under Simulated and Actual Stress (SUSAS) database. The results show that articulatory-based features help to improve system performance. Furthermore, a significant performance improvement is observed when using the FFT spectrum that is less sensitive to stress.
  • Keywords
    cepstral analysis; fast Fourier transforms; speech processing; speech recognition; time-frequency analysis; FFT; SUSAS database; TEO; articulatory based feature; auditory based feature; fast Fourier transform spectra; nonlinear property; sensitivity; speech recognition; stressed speech data; system performance; teager energy operator; time-frequency domain; voicedness; Cepstral analysis; Fast Fourier transforms; Frequency domain analysis; Mel frequency cepstral coefficient; Robustness; Spatial databases; Speech recognition; Stress; System performance; System testing; Articulatory feature; Nonlinear FFT spectrum; Robust speech recognition; Stressed speech;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information, Communications and Signal Processing, 2005 Fifth International Conference on
  • Conference_Location
    Bangkok
  • Print_ISBN
    0-7803-9283-3
  • Type
    conf
  • DOI
    10.1109/ICICS.2005.1689273
  • Filename
    1689273