  • DocumentCode
    1420403
  • Title
    A multispan language modeling framework for large vocabulary speech recognition
  • Author
    Bellegarda, Jerome R.
  • Author_Institution
    Spoken Language Group, Apple Computer, Inc., Cupertino, CA, USA
  • Volume
    6
  • Issue
    5
  • fYear
    1998
  • fDate
    9/1/1998
  • Firstpage
    456
  • Lastpage
    467
  • Abstract
    A new framework is proposed to construct multispan language models for large vocabulary speech recognition, by exploiting both local and global constraints present in the language. While statistical n-gram modeling can readily take local constraints into account, global constraints have been more difficult to handle within a data-driven formalism. In this work, they are captured via a paradigm first formulated in the context of information retrieval, called latent semantic analysis (LSA). This paradigm seeks to automatically uncover the salient semantic relationships between words and documents in a given corpus. Such discovery relies on a parsimonious vector representation of each word and each document in a suitable, common vector space. Since in this space familiar clustering techniques can be applied, it becomes possible to derive several families of large-span language models, with various smoothing properties. Because of their semantic nature, the new language models are well suited to complement conventional, more syntactically oriented n-grams, and the combination of the two paradigms naturally yields the benefit of a multispan context. An integrative formulation is proposed for this purpose, in which the latent semantic information is used to adjust the standard n-gram probability. The performance of the resulting multispan language models, as measured by perplexity, compares favorably with the corresponding n-gram performance.
  • Keywords
    grammars; information retrieval; natural languages; probability; speech recognition; statistical analysis; clustering techniques; data-driven formalism; documents; global constraints; integrative formulation; large vocabulary speech recognition; large-span language models; latent semantic analysis; local constraints; multispan language modeling; n-gram performance; n-gram probability; perplexity; semantic relationships; smoothing properties; statistical n-gram modeling; vector representation; vector space; words; acoustic applications; context modeling; frequency estimation; information analysis; smoothing methods; vocabulary
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Speech and Audio Processing
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type
    jour
  • DOI
    10.1109/89.709671
  • Filename
    709671