DocumentCode
2576065
Title
Sources of degradation of speech recognition in the telephone network
Author
Moreno, Pedro J. ; Stern, Richard M.
Author_Institution
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear
1994
fDate
19-22 Apr 1994
Abstract
We compare speech recognition accuracy for high-quality speech recorded under controlled conditions with speech as it appears over long-distance telephone lines. In addition to comparing recognition accuracy, we use telephone-channel simulation to identify the sources of degradation of speech over telephone lines that have the greatest impact on speech recognition accuracy. We first compare the performance of the CMU SPHINX-I system on the TIMIT and NTIMIT databases. We found that other factors beyond a mere decrease in bandwidth cause the observed degradation in recognition accuracy, and that the environmental compensation algorithms RASTA and CDCN fail to compensate completely for degradations introduced by the telephone network. We identify the most problematic telephone-channel impairments using a commercial telephone channel simulator and the SPHINX-II system. Of the various effects considered, additive noise and linear filtering appear to have the greatest impact on recognition accuracy. Finally, we examined the performance of three cepstral compensation algorithms in the presence of the most damaging conditions. We found the compensation algorithms to be effective except for the worst 1% of the telephone channels.
Keywords
filtering theory; noise; speech intelligibility; speech recognition; telecommunication channels; telephone lines; telephone networks; telephony; CDCN; CMU SPHINX-I system; NTIMIT database; RASTA; SPHINX-II system; TIMIT database; additive noise; bandwidth; cepstral compensation algorithms; environmental compensation algorithms; high-quality speech; linear filtering; long-distance telephone lines; speech recognition accuracy; speech recognition degradations; telephone channel impairments; telephone channel simulator; telephone network; telephone-channel simulation; Additive noise; Band pass filters; Bandwidth; Computer science; Databases; Degradation; Intelligent networks; Maximum likelihood detection; Speech recognition; Telephony;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location
Adelaide, SA
ISSN
1520-6149
Print_ISBN
0-7803-1775-0
Type
conf
DOI
10.1109/ICASSP.1994.389343
Filename
389343