• DocumentCode
    2254047
  • Title
    Introducing linguistic constraints into statistical language modeling
  • Author
    Geutner, Petra
  • Author_Institution
    Karlsruhe Univ., Germany
  • Volume
    1
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    402
  • Abstract
    Building robust stochastic language models is a major issue in speech recognition systems. Conventional word-based n-gram models do not capture any linguistic constraints inherent in speech. In this paper, the notion of function and content words (closed and open word classes, respectively) is used to provide linguistic knowledge that can be incorporated into language models. Function words are articles, prepositions and personal pronouns. Content words are nouns, verbs, adjectives and adverbs. Based on this class definition, which yields function and content word markers, a new language model is defined. A combination of the word-based model with this new model is introduced. The combined model shows modest improvements in both perplexity and recognition performance.
  • Keywords
    grammars; linguistics; natural languages; speech recognition; statistics; stochastic processes; adjectives; adverbs; articles; closed word classes; content words; function words; linguistic constraints; nouns; open word classes; perplexity; personal pronouns; prepositions; recognition performance; robust stochastic language models; speech recognition systems; statistical language modeling; verbs; word markers; word-based n-gram models; Databases; History; Interactive systems; Laboratories; Natural languages; Predictive models; Robustness; Speech recognition; Stochastic processes; Stochastic systems;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type
    conf
  • DOI
    10.1109/ICSLP.1996.607139
  • Filename
    607139
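
The abstract describes combining a word-based n-gram model with a model built on function/content word markers. The record does not give the paper's actual formulation, so the following is only a minimal sketch under stated assumptions: the function-word list, the add-one smoothing, the class-based factorization P(marker | previous marker) * P(word | marker), the linear interpolation weight `lam`, and all class and function names are hypothetical and chosen purely for illustration. A closed vocabulary is assumed.

```python
import math
from collections import Counter

# Hypothetical closed-class list; the paper's real inventory is not given in this record.
FUNCTION_WORDS = {"the", "a", "an", "in", "on", "of", "to",
                  "i", "you", "he", "she", "it", "we", "they"}

def marker(word):
    """Map a word to a function ("F") or content ("C") marker."""
    return "F" if word.lower() in FUNCTION_WORDS else "C"

class AddOneBigram:
    """Bigram model with add-one smoothing over a fixed symbol set."""
    def __init__(self, sequences, symbols):
        self.size = len(symbols)
        self.pair = Counter()
        self.ctx = Counter()
        for seq in sequences:
            padded = ["<s>"] + list(seq)
            for prev, cur in zip(padded, padded[1:]):
                self.pair[prev, cur] += 1
                self.ctx[prev] += 1

    def prob(self, prev, cur):
        return (self.pair[prev, cur] + 1) / (self.ctx[prev] + self.size)

class CombinedModel:
    """Linear interpolation of a word bigram and a marker-based model (assumed form)."""
    def __init__(self, sentences, lam=0.7):
        self.lam = lam
        self.vocab = sorted({w for s in sentences for w in s})
        self.word_lm = AddOneBigram(sentences, self.vocab)
        marker_seqs = [[marker(w) for w in s] for s in sentences]
        self.marker_lm = AddOneBigram(marker_seqs, ["F", "C"])
        self.word_count = Counter(w for s in sentences for w in s)
        self.class_count = Counter(marker(w) for s in sentences for w in s)
        self.class_types = Counter(marker(w) for w in self.vocab)

    def prob(self, prev, cur):
        p_word = self.word_lm.prob(prev, cur)
        prev_m = prev if prev == "<s>" else marker(prev)
        m = marker(cur)
        # P(word | marker), add-one smoothed within the marker's class
        p_emit = (self.word_count[cur] + 1) / (self.class_count[m] + self.class_types[m])
        p_class = self.marker_lm.prob(prev_m, m) * p_emit
        return self.lam * p_word + (1 - self.lam) * p_class

    def perplexity(self, sentence):
        padded = ["<s>"] + list(sentence)
        log_p = sum(math.log(self.prob(p, c)) for p, c in zip(padded, padded[1:]))
        return math.exp(-log_p / len(sentence))

# Toy training data, invented for illustration only.
sentences = [
    ["the", "dog", "runs"],
    ["a", "cat", "sleeps", "on", "the", "mat"],
    ["she", "runs", "to", "the", "dog"],
]
model = CombinedModel(sentences)
print(model.perplexity(["the", "cat", "runs"]))
```

Setting `lam=1.0` recovers the plain word bigram, so comparing perplexities at different weights gives a rough feel for how much the marker model contributes on a given corpus.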