Title :
Transform representation of the spectra of acoustic speech segments with applications. I. General approach and application to speech recognition
Author :
Algazi, V. Ralph ; Brown, Kathy L. ; Ready, Michael J. ; Irvine, David H. ; Cadwell, Christie L. ; Chung, Sang
Author_Institution :
Center for Image Process. & Integrated Comput., California Univ., Davis, CA, USA
fDate :
4/1/1993
Abstract :
An approach to modeling and capturing the time-varying structure of the spectral envelope of speech is reported. Acoustic subword decomposition and the Karhunen-Loeve transform (KLT) are used to extract and efficiently represent the highly correlated structure of the spectral envelope. Integration of the KLT with acoustic subword modeling provides concise representation of both steady-state and dynamic features of the spectra in a unified framework that very effectively captures acoustic-phonetic patterns. The physiological and perceptual basis for the approach, the frame-based and acoustic-subword-based spectral representation, and applications to speaker-dependent recognition are presented. The performance of the recognition algorithm based on this approach compares favorably with that of other techniques.
Keywords :
spectral analysis; speech analysis and processing; speech recognition; transforms; KLT; Karhunen-Loeve transform; acoustic speech segments; acoustic subword decomposition; acoustic subword modeling; acoustic-phonetic patterns; dynamic features; frame-based representation; speaker-dependent recognition; spectral envelope; steady-state features; time-varying structure; Acoustic applications; Feature extraction; Helium; Signal analysis; Signal processing; Speech analysis; Speech enhancement; Speech processing; Speech recognition; Speech synthesis
Journal_Title :
Speech and Audio Processing, IEEE Transactions on