  • DocumentCode
    3245920
  • Title
    A state-space method for language modeling
  • Author
    Siivola, Vesa; Honkela, Antti
  • Author_Institution
    Neural Networks Res. Centre, Helsinki Univ. of Technol., Espoo, Finland
  • fYear
    2003
  • fDate
    30 Nov.-3 Dec. 2003
  • Firstpage
    548
  • Lastpage
    553
  • Abstract
    A new state-space method for language modeling is presented. The complexity of the model is controlled by choosing the dimension of the state, rather than by the smoothing and back-off methods common in n-gram modeling. The model complexity also controls the generalization ability of the model, allowing it to handle similar words in a similar manner. We compare the state-space model to a traditional n-gram model in a letter prediction task. In this proof-of-concept experiment, the state-space model gives results similar to those of the n-gram model with sparse training data, but performs clearly worse with dense training data. While the initial results are encouraging, the training algorithm should be made more effective, so that it can fully exploit the model structure and scale up to larger token sets, such as words.
  • Keywords
    computational complexity; natural languages; speech recognition; state-space methods; complexity; dense training data; language modeling; letter prediction; n-gram model; sparse training data; state-space method; Frequency; History; Natural languages; Neural networks; Probability distribution; Smoothing methods; Speech recognition; State-space methods; Training data; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
  • Print_ISBN
    0-7803-7980-2
  • Type
    conf
  • DOI
    10.1109/ASRU.2003.1318499
  • Filename
    1318499