• DocumentCode
    730785
  • Title
    Robust speech processing using ARMA spectrogram models
  • Author
    Ganapathy, Sriram
  • Author_Institution
    IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    5029
  • Lastpage
    5033
  • Abstract
    Speech applications in noisy and degraded channel conditions continue to be a challenging problem, especially when there is a mismatch between the training and test conditions. In this paper, a robust speech feature extraction scheme is developed based on autoregressive moving average (ARMA) modeling that emphasizes high-energy regions of the signal with a data-driven modulation filter. The peak-preserving ability of two-dimensional autoregressive (AR) models is used to emphasize the high-energy regions in the spectrotemporal domain. The modulation filtering property is achieved by moving average (MA) modeling. The ARMA spectrograms are used to derive features for speech recognition on the Aurora-4 database. In these experiments, the ARMA model features provide significant improvements (relative improvements of 15%) compared to other robust features. Furthermore, the robustness of these features is also verified for language identification (LID) of highly degraded radio channel speech. Here, the ARMA approach achieves relative improvements of up to 20% over the baseline features.
  • Keywords
    feature extraction; filtering theory; speech processing; AR models; ARMA modeling; ARMA spectrogram models; Aurora-4 database; LID; autoregressive moving average; baseline features; data driven modulation filter; degraded channel conditions; dimensional autoregressive models; language identification; modulation filtering property; radio channel speech; robust speech feature extraction scheme; robust speech processing; speech applications; speech recognition; Distortion; Mel frequency cepstral coefficient; Predictive models; Robustness; Spectrogram; Speech; Telecommunication standards; ARMA Modeling; Language Identification; Robust Feature Extraction; Speech Recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178928
  • Filename
    7178928