Title :
On the bias of the Turing-Good estimate of probabilities
Author :
Juang, B.H. ; Lo, S.H.
Author_Institution :
AT&T Bell Labs., Murray Hill, NJ, USA
fDate :
2/1/1994
Abstract :
Good's (1953) estimate, based on Turing's formula, was suggested for estimating the probabilities of words in text as well as of species in a mixed population, and was found particularly useful for estimating the probability of unseen classes. The authors address the issue of bias in Good's estimate and propose an alternative to reduce this bias. This may be important in the construction of a language model for speech recognition, where sparse data and low-probability events are key problems.
Keywords :
estimation theory; probability; speech recognition; Turing-Good probability estimate; bias; language model; low probability events; mixed population; sparse data; species; text; words; Bayesian methods; Maximum likelihood estimation; Natural languages; Text analysis; Tin; Vocabulary;
Journal_Title :
Signal Processing, IEEE Transactions on