• DocumentCode
    284668
  • Title
    An automatic technique to include grammatical and morphological information in a trigram-based statistical language model
  • Author
    Maltese, G.; Mancini, F.

  • Author_Institution
    IBM Semea Rome Sci. Center, Italy
  • Volume
    1
  • fYear
    1992
  • fDate
    23-26 Mar 1992
  • Firstpage
    157
  • Abstract
    A technique for incorporating grammatical and morphological information into a trigram-based statistical language model is presented. This is achieved automatically by interpolating the trigram model (which uses sequences of words) with statistical models based on sequences of grammatical categories and/or lemmas. The approach reduces the effect of data sparseness in the trigram model, in part because of the way the interpolation coefficients are chosen. Compared with trigrams alone, the authors obtained a significant reduction in perplexity on various texts, even when combining a well-trained trigram model with a small grammatical/morphological model.
  • Keywords
    grammars; interpolation; speech recognition; statistical analysis; automatic speech recognition; grammatical/morphological model; interpolation coefficients; morphological information; probability; statistical language model; trigram model; Frequency estimation; Loudspeakers; Natural languages; Smoothing methods; Statistics; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1992. ICASSP-92., 1992 IEEE International Conference on
  • Conference_Location
    San Francisco, CA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-0532-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.1992.225948
  • Filename
    225948
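
The interpolation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the form of the grammatical model or how the interpolation coefficients are chosen, so the sketch assumes one grammatical category per word, a category trigram model of the form P(c | c1, c2) * P(w | c), unsmoothed maximum-likelihood counts, and a single fixed coefficient `lam`. All names (`train_counts`, `interpolated_prob`, `TAG_OF`, `SENTENCES`) and the toy data are hypothetical.

```python
from collections import Counter

BOS = "<s>"  # sentence-start padding symbol (an assumption of this sketch)


def train_counts(sentences, tag_of):
    """Collect raw counts for a word trigram model and a category trigram model."""
    counts = {name: Counter() for name in ("w3", "w2", "t3", "t2", "wt", "t")}
    for sent in sentences:
        words = [BOS, BOS] + list(sent)
        tags = [BOS, BOS] + [tag_of.get(w, "UNK") for w in sent]
        for i in range(2, len(words)):
            counts["w3"][tuple(words[i - 2:i + 1])] += 1  # word trigram
            counts["w2"][tuple(words[i - 2:i])] += 1      # word history
            counts["t3"][tuple(tags[i - 2:i + 1])] += 1   # category trigram
            counts["t2"][tuple(tags[i - 2:i])] += 1       # category history
            counts["wt"][(words[i], tags[i])] += 1        # word within category
            counts["t"][tags[i]] += 1                     # category unigram
    return counts


def _ratio(num, den):
    """Relative frequency, defined as 0.0 when the denominator is unseen."""
    return num / den if den else 0.0


def interpolated_prob(counts, tag_of, w1, w2, w, lam=0.7):
    """Return lam * P_word(w | w1, w2) + (1 - lam) * P_cat(w | w1, w2).

    lam is a fixed value chosen for illustration only; it is not the
    coefficient-selection scheme used in the paper.
    """
    def tag(x):
        return BOS if x == BOS else tag_of.get(x, "UNK")

    t1, t2, t = tag(w1), tag(w2), tag(w)
    p_word = _ratio(counts["w3"][(w1, w2, w)], counts["w2"][(w1, w2)])
    p_cat = (_ratio(counts["t3"][(t1, t2, t)], counts["t2"][(t1, t2)])
             * _ratio(counts["wt"][(w, t)], counts["t"][t]))
    return lam * p_word + (1.0 - lam) * p_cat


# Toy data (hypothetical): two training sentences and a one-category-per-word lexicon.
TAG_OF = {"the": "DET", "a": "DET", "dog": "NOUN", "cat": "NOUN",
          "runs": "VERB", "sleeps": "VERB"}
SENTENCES = [["the", "dog", "runs"], ["a", "cat", "sleeps"]]
counts = train_counts(SENTENCES, TAG_OF)

# "the dog sleeps" never occurs as a word trigram, so the word model alone
# assigns it zero probability; the category model (DET NOUN VERB) does not.
print(interpolated_prob(counts, TAG_OF, "the", "dog", "sleeps", lam=1.0))
print(interpolated_prob(counts, TAG_OF, "the", "dog", "sleeps", lam=0.7))
```

In this toy setting the unseen trigram receives zero probability from the word model alone (`lam=1.0`) and a nonzero probability once the category model is mixed in, which is the data-sparseness effect the abstract refers to. A practical system would also smooth each component model and estimate the coefficients from held-out data.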