Title :
Just-in-time language modelling
Author :
Berger, Adam ; Miller, Robert
Author_Institution :
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
Traditional approaches to language modelling have relied on a fixed corpus of text to inform the parameters of a probability distribution over word sequences. Increasing the corpus size often leads to better-performing language models, but no matter how large, the corpus is a static entity, unable to reflect information about events that postdate it. We introduce an online paradigm which interleaves the estimation and application of a language model. We present a Bayesian approach to online language modelling, in which the marginal probabilities of a static trigram model are dynamically updated to match the topic being dictated to the system. We also describe the architecture of a prototype we have implemented, which uses the World Wide Web (WWW) as a source of information, and provide results from some initial proof-of-concept experiments.
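The abstract does not give the update rule, so the following is only an illustrative sketch of one way marginals of a static model could be adapted to a topic, not the authors' method. It assumes a Dirichlet-style posterior mean for the unigram marginals (prior centred on the static marginals) followed by a rescaling of the static conditional distribution. The function name `adapt_conditional` and the parameters `prior_strength` and `beta` are hypothetical.

```python
from collections import Counter


def adapt_conditional(static_cond, static_unigram, topic_counts,
                      prior_strength=100.0, beta=0.5):
    """Rescale a static conditional P(w | history) toward topic-adapted marginals.

    static_cond    : dict word -> P_static(w | history) for one fixed history
    static_unigram : dict word -> P_static(w), all values assumed > 0
    topic_counts   : dict word -> count in freshly retrieved topic text
    prior_strength : assumed pseudo-count weight given to the static marginals
    beta           : assumed exponent damping the rescaling (0 = no adaptation)
    """
    vocab = list(static_unigram)
    total = sum(topic_counts.get(w, 0) for w in vocab)

    # Posterior-mean unigram under a Dirichlet prior centred on the static marginals.
    adapted_unigram = {
        w: (topic_counts.get(w, 0) + prior_strength * static_unigram[w])
           / (total + prior_strength)
        for w in vocab
    }

    # Scale each conditional probability by the damped marginal ratio.
    scaled = {
        w: static_cond[w] * (adapted_unigram[w] / static_unigram[w]) ** beta
        for w in vocab
    }

    # Renormalise so the adapted conditional sums to one.
    z = sum(scaled.values())
    return {w: p / z for w, p in scaled.items()}


if __name__ == "__main__":
    static_unigram = {"the": 0.5, "stock": 0.25, "cat": 0.25}
    static_cond = {"the": 0.2, "stock": 0.4, "cat": 0.4}
    topic_counts = Counter({"stock": 30, "the": 10})  # toy "retrieved" text
    print(adapt_conditional(static_cond, static_unigram, topic_counts))
```

In this toy setting, words that are frequent in the retrieved topic text gain probability mass and unseen words lose it, while the static model still determines the overall shape of the distribution.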
Keywords :
Bayes methods; handicapped aids; hearing; natural languages; online operation; probability; speech processing; speech recognition; Bayesian approach; WWW; World Wide Web; architecture; automatic speech recognition; corpus size; estimation; experiments; hearing-impaired; just-in-time language modelling; marginal probabilities; online language modelling; online paradigm; probability distribution; static trigram model; text corpus; transcript generation; word sequences; Automatic speech recognition; Bayesian methods; Computer science; Concrete; Information resources; Pressing; Prototypes; Service oriented architecture; Speech recognition; Web sites;
Conference_Title :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.675362