Analysis of a simple bipos language model - attempt at a strategy to improve language models for speech recognition

Author

Ueberla, J.P.

Author_Institution

Sch. of Comput. Sci., Simon Fraser Univ., Burnaby, BC, Canada

fYear

1993

fDate

22-23 Apr 1993

Abstract

At each point in an utterance, a speech recognizer has to choose the most likely words from among all the words in the vocabulary. To that end, it uses an acoustic model and a language model; the author focuses on the language model. The bipos model is presented and analysed. A method called probability decomposition is introduced to measure which parts of the model perform particularly well or poorly. Based on this analysis, the author modifies the modeling of unknown words, which reduces the entropy by at least 14% (and up to 21%). Other conclusions obtained from the analysis are also given. An attempt at a general strategy for improving language models is then presented. To that end, the author defines a class of models called state language models. This class contains most currently employed models, yet these models cover only a small area of the space of all possible state language models. A more systematic study of this space is proposed in order to improve current language models. A statistical method called classification and regression trees is presented as a tool for this purpose.

Keywords

computational linguistics; probability; speech recognition; bipos model; classification; entropy; probability decomposition; regression trees; speech recognizer; state language models; statistical method; systematic study; unknown words

fLanguage

English

Publisher

iet

Conference_Titel

Grammatical Inference: Theory, Applications and Alternatives, IEE Colloquium on

Conference_Location

Colchester

Type

conf

Filename

243119