DocumentCode
730785
Title
Robust speech processing using ARMA spectrogram models
Author
Ganapathy, Sriram
Author_Institution
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
fYear
2015
fDate
19-24 April 2015
Firstpage
5029
Lastpage
5033
Abstract
Speech applications in noisy and degraded channel conditions continue to be a challenging problem, especially when there is a mismatch between the training and test conditions. In this paper, a robust speech feature extraction scheme is developed based on autoregressive moving average (ARMA) modeling that emphasizes high-energy regions of the signal with a data-driven modulation filter. The peak-preserving ability of two-dimensional autoregressive (AR) models is used to emphasize the high-energy regions in the spectro-temporal domain. The modulation filtering property is achieved by moving average (MA) modeling. The ARMA spectrograms are used to derive features for speech recognition on the Aurora-4 database. In these experiments, the ARMA model features provide significant improvements (relative improvements of 15%) compared to other robust features. Furthermore, the robustness of these features is also verified for language identification (LID) of highly degraded radio channel speech. Here, the ARMA approach achieves relative improvements of up to 20% over the baseline features.
Keywords
feature extraction; filtering theory; speech processing; AR models; ARMA modeling; ARMA spectrogram models; Aurora-4 database; LID; autoregressive moving average; baseline features; data driven modulation filter; degraded channel conditions; dimensional autoregressive models; language identification; modulation filtering property; radio channel speech; robust speech feature extraction scheme; robust speech processing; speech applications; speech recognition; Distortion; Mel frequency cepstral coefficient; Predictive models; Robustness; Spectrogram; Speech; Telecommunication standards; ARMA Modeling; Language Identification; Robust Feature Extraction; Speech Recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178928
Filename
7178928