Estimation of probabilities in the language model of the IBM speech recognition system

Author

Nádas, Arthur

Author_Institution

IBM T.J. Watson Research Center, Yorktown Heights, NY

Volume

Issue

fYear

1984

fDate

8/1/1984

Firstpage

859

Lastpage

861

Abstract

The language model probabilities are estimated by an empirical Bayes approach in which a prior distribution for the unknown probabilities is itself estimated through a novel choice of data. The predictive power of the model thus fitted is compared by means of its experimental perplexity [1] to the model as fitted by the Jelinek-Mercer deleted estimator and as fitted by the Turing-Good formulas for probabilities of unseen or rarely seen events.
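As a rough illustration of two quantities named in the abstract, the sketch below computes the perplexity of a test sequence from per-word model probabilities, and the simplest Turing-Good quantity, the estimated total probability of unseen events (the number of events seen exactly once divided by the sample size). It does not reproduce the paper's empirical Bayes estimator or the Jelinek-Mercer deleted estimator; the function names and the toy data are assumptions made for this example only.

```python
import math
from collections import Counter

def perplexity(probs):
    """Perplexity of a test sequence, given the model probability of each word.

    Computed as exp of the average negative log-probability.
    """
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

def turing_good_unseen_mass(counts):
    """Turing-Good estimate of the total probability of unseen events: N1 / N.

    N1 is the number of distinct events observed exactly once,
    N is the total number of observations.
    """
    n = sum(counts.values())
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / n

# Hypothetical toy sample: 'c' and 'd' each occur once out of 7 tokens.
toy_counts = Counter("a b a c a b d".split())
unseen_mass = turing_good_unseen_mass(toy_counts)

# A model assigning probability 0.25 to every test word has perplexity
# close to 4 (up to floating-point rounding).
uniform_perplexity = perplexity([0.25] * 4)
```

A lower perplexity on held-out text indicates better predictive power, which is the basis on which the abstract compares the three fitting methods.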

Keywords

Bayesian methods; Cities and towns; Helium; Natural languages; Power system modeling; Predictive models; Probability; Smoothing methods; Speech recognition; Vocabulary

fLanguage

English

Journal_Title

Acoustics, Speech and Signal Processing, IEEE Transactions on

Publisher

IEEE

ISSN

0096-3518

Type

jour

DOI

10.1109/TASSP.1984.1164378

Filename

1164378

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=1103789