DocumentCode :
3030841
Title :
Recognition of continuously read natural corpus
Author :
Bahl, L. ; Baker, J.K. ; Cohen, P.S. ; Jelinek, F. ; Lewis, B.L. ; Mercer, R.L.
Author_Institution :
IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
Volume :
3
Year :
1978
Date :
Apr 1978
Firstpage :
422
Lastpage :
424
Abstract :
Preliminary results have been obtained with a system for recognizing continuously read sentences from a naturally-occurring corpus (Laser Patents), restricted to a 1000-word vocabulary. Our model of the task language has an entropy of about 4.8 bits/word and a perplexity of 21.11 words. Many new problems arise in recognition of a substantial natural corpus (compared to recognition of an artificially constrained language). Some techniques are described for treating these problems. On a test set consisting of 20 sentences having a total of 486 words, there was a word error rate of 33.1%.
Keywords :
Acoustic measurements; Decoding; Dictionaries; Erbium; Humans; Iterative algorithms; Loudspeakers; Mathematical model; Natural languages; Probability;
Language :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '78.
Type :
conf
DOI :
10.1109/ICASSP.1978.1170402
Filename :
1170402