Title :
Towards efficient and scalable speech compression schemes for robust speech recognition applications
Author :
Srinivasamurthy, N. ; Ortega, A. ; Zhu, Q. ; Alwan, A.
Author_Institution :
Dept. of Electr. Eng. Syst., Univ. of Southern California, Los Angeles, CA, USA
Abstract :
This paper presents a scheme for distributed automatic speech recognition. A hidden Markov model (HMM)-based speech recognition system with a mel frequency cepstral coefficient (MFCC) front end was used in the evaluation. The goal was to achieve good recognition performance while compressing the MFCC feature vectors. Compression rates and recognition performance are reported for both a digit and an alphabet database. Compared with recognizing speech encoded by low bit rate encoders, and with previously reported schemes, our method can achieve good recognition performance at bit rates lower than 1 kbps with low encoding complexity. The encoding algorithms developed are scalable, allowing trade-offs between bit rate and recognition performance, and can be combined with unequal error protection or prioritization to allow graceful degradation of performance in the presence of channel errors.
Keywords :
cepstral analysis; data compression; hidden Markov models; performance evaluation; speech coding; speech recognition; MFCC feature vectors; alphabet database; channel errors; digit database; distributed automatic speech recognition; encoding complexity; hidden Markov model; low bit rate encoders; mel frequency cepstral coefficients; performance; speech compression; Bit rate; Cepstral analysis; Engines; Hidden Markov models; Mel frequency cepstral coefficient; Quantization; Robustness; Scalability; Speech coding; Speech recognition;
Conference_Title :
Multimedia and Expo, 2000. ICME 2000. 2000 IEEE International Conference on
Conference_Location :
New York, NY
Print_ISBN :
0-7803-6536-4
DOI :
10.1109/ICME.2000.869589