Title :
Cepstral domain talker stress compensation for robust speech recognition
Author_Institution :
MIT Lincoln Lab., Lexington, MA, USA
fDate :
4/1/1988 12:00:00 AM
Abstract :
A study of talker-stress-induced intraword variability and an algorithm that compensates for the systematic changes observed are presented. The study is based on hidden Markov models trained on speech tokens spoken in various talking styles. The talking styles include normal speech, fast speech, loud speech, soft speech, and talking with noise injected through earphones; the styles are designed to simulate speech produced under real stressful conditions. Cepstral coefficients are used as the parameters in the hidden Markov models. The stress compensation algorithm compensates for the variations in the cepstral coefficients in a hypothesis-driven manner. The functional form of the compensation is shown to correspond to the equalization of spectral tilts. A substantial reduction in error rates was achieved when the cepstral domain compensation techniques were tested on the simulated-stress speech database. The hypothesis-driven compensation technique reduced the average error rate from 13.9% to 6.2%. When a more sophisticated recognizer was used, it reduced the error rate from 2.5% to 1.9%.
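The link between a cepstral-domain offset and spectral tilt can be illustrated with a minimal sketch. It is not the paper's compensation algorithm: it only shows, under stated assumptions, that shifting the first real-cepstrum coefficient changes the log-magnitude spectrum by a smooth cosine slope across frequency. The function name `log_spectrum_change`, the FFT size, and the random starting spectrum are all hypothetical choices made for illustration.

```python
import numpy as np


def log_spectrum_change(n_fft: int, delta_c1: float) -> np.ndarray:
    """Change in log-magnitude spectrum caused by shifting cepstral coefficient c1.

    Illustrative only: the starting spectrum is arbitrary (hypothetical), and
    the real cepstrum is taken as the inverse DFT of the log-magnitude spectrum.
    """
    # Arbitrary log-magnitude spectrum on the rfft grid (bins 0 .. n_fft/2).
    rng = np.random.default_rng(0)
    log_mag = rng.normal(size=n_fft // 2 + 1)

    # Real cepstrum: inverse DFT of the symmetric log-magnitude spectrum.
    cep = np.fft.irfft(log_mag, n=n_fft)

    # Shift c1, and its mirror image so the cepstrum stays symmetric.
    shifted = cep.copy()
    shifted[1] += delta_c1
    shifted[-1] += delta_c1

    # Back to the log-spectral domain and compare with the original.
    new_log_mag = np.fft.rfft(shifted).real
    return new_log_mag - log_mag


if __name__ == "__main__":
    change = log_spectrum_change(n_fft=64, delta_c1=0.5)
    # The change is largest at DC and most negative at the Nyquist bin.
    print(change[0], change[-1])
```

Under these assumptions the change equals 2 * delta_c1 * cos(omega), which falls monotonically from low to high frequencies, so an additive correction to a low-order cepstral coefficient acts as a tilt adjustment on the log spectrum.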
Keywords :
Markov processes; speech recognition; cepstral coefficients; cepstral domain compensation; earphones; error rates; fast speech; hidden Markov models; hypothesis-driven compensation; loud speech; noise; normal speech; soft speech; speech recognition; speech tokens; stress compensation algorithm; talker stress compensation; talking styles; Cepstral analysis; Databases; Degradation; Error analysis; Hidden Markov models; Human factors; Robustness; Speech enhancement; Speech recognition; Stress;
Journal_Title :
Acoustics, Speech and Signal Processing, IEEE Transactions on