DocumentCode
2254047
Title
Introducing linguistic constraints into statistical language modeling
Author
Geutner, Petra
Author_Institution
Karlsruhe Univ., Germany
Volume
1
fYear
1996
fDate
3-6 Oct 1996
Firstpage
402
Abstract
Building robust stochastic language models is a major issue in speech recognition systems. Conventional word-based n-gram models do not capture any linguistic constraints inherent in speech. In this paper, the notion of function and content words (closed/open word classes) is used to provide linguistic knowledge that can be incorporated into language models. Function words are articles, prepositions and personal pronouns. Content words are nouns, verbs, adjectives and adverbs. Based on this class definition, which yields function and content word markers, a new language model is defined. A combination of the word-based model with this new model is introduced. The combined model shows modest improvements in both perplexity and recognition performance.
Keywords
grammars; linguistics; natural languages; speech recognition; statistics; stochastic processes; adjectives; adverbs; articles; closed word classes; content words; function words; linguistic constraints; nouns; open word classes; perplexity; personal pronouns; prepositions; recognition performance; robust stochastic language models; speech recognition systems; statistical language modeling; verbs; word markers; word-based n-gram models; Databases; History; Interactive systems; Laboratories; Natural languages; Predictive models; Robustness; Speech recognition; Stochastic processes; Stochastic systems
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.607139
Filename
607139