Title :
Time Frequency Representation for Speech Recognition
Author :
Amsalem, Avishay ; Shallom, Ilan D.
Author_Institution :
Electr. Eng. Dept., Ben-Gurion Univ., Beer-Sheva
Abstract :
In speech recognition, incorporating the dynamics of speech has been shown to improve recognition accuracy. This idea is embodied in Mel frequency cepstral coefficients (MFCC) and their derivatives, which represent both the static and the dynamic characteristics of the vocal tract. This paper presents a new method for capturing the dynamic features of non-stationary speech signals. The proposed approach isolates each cepstral band and projects it onto an orthogonal space spanned by a set of well-defined orthogonal functions. The main idea is to capture and represent the energy transitions between successive short-term speech frames along a non-stationary segment of about 100 ms. Non-stationary speech segments are represented by time-frequency representations (TFR), and the analysis is modified to fit two-dimensional data. An evaluation of the proposed features on the TIDIGIT corpus showed an average 58% improvement in word error rate compared to MFCC and its derivatives, in the context of isolated speech recognition in noisy environments.
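The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the orthogonal function family, so an orthonormal DCT-II basis is assumed here, and the names `dct_basis` and `project_segment`, the 10-frame segment length (about 100 ms at an assumed 10 ms frame shift), and the number of basis functions are all hypothetical.

```python
import math


def dct_basis(num_frames, num_funcs):
    """Return num_funcs orthonormal DCT-II functions sampled at num_frames points.

    Assumption: the abstract only says "well defined orthogonal functions";
    DCT-II is one common choice, used here purely for illustration.
    """
    basis = []
    for k in range(num_funcs):
        # Scaling that makes the rows mutually orthonormal.
        scale = math.sqrt((1.0 if k == 0 else 2.0) / num_frames)
        basis.append([scale * math.cos(math.pi * (n + 0.5) * k / num_frames)
                      for n in range(num_frames)])
    return basis


def project_segment(segment, num_funcs=3):
    """Project each band's trajectory over a segment onto the basis.

    segment[t][b] is the value of cepstral band b at frame t.
    Returns features[b][k], the k-th projection coefficient of band b.
    """
    num_frames = len(segment)
    num_bands = len(segment[0])
    basis = dct_basis(num_frames, num_funcs)
    return [[sum(basis[k][t] * segment[t][b] for t in range(num_frames))
             for k in range(num_funcs)]
            for b in range(num_bands)]


# Hypothetical input: 10 frames, 2 bands, each band constant over the segment.
segment = [[2.0, -1.0] for _ in range(10)]
features = project_segment(segment, num_funcs=3)
```

For a band that does not change over the segment, only the zeroth coefficient is non-zero, so under this assumed basis the higher-order coefficients carry the frame-to-frame energy transitions that the abstract aims to capture.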
Keywords :
speech recognition; Mel frequency cepstral coefficients; TIDIGIT corpus; cepstral band isolation; nonstationary speech signals; time frequency representation; Automatic speech recognition; Cepstral analysis; Feature extraction; Mel frequency cepstral coefficient; Noise level; Noise robustness; Speech analysis; Time frequency analysis; Working environment noise; Basis Function Families; Mel Frequency Filter Bank; Orthogonal Projection; Speech Processing
Conference_Title :
Information Technology: Research and Education, 2006. ITRE '06. International Conference on
Conference_Location :
Tel-Aviv
Print_ISBN :
1-4244-0858-X
Electronic_ISBN :
1-4244-0859-8
DOI :
10.1109/ITRE.2006.381542