• DocumentCode
    1260751
  • Title
    Low Rank Language Models for Small Training Sets
  • Author
    Hutchinson, Brian; Ostendorf, Mari; Fazel, Maryam
  • Author_Institution
    Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA
  • Volume
    18
  • Issue
    9
  • fYear
    2011
  • Firstpage
    489
  • Lastpage
    492
  • Abstract
    Several language model smoothing techniques are effective for a variety of tasks; however, training with small data sets remains difficult. This letter introduces the low rank language model, which uses a low rank tensor representation of joint probability distributions for parameter tying and optimizes likelihood under a rank constraint. It obtains lower perplexity than standard smoothing techniques when the training set is small, and it also reduces perplexity in domain adaptation when interpolated with a general, out-of-domain model.
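  • Illustration
    The abstract does not spell out the estimator, so the sketch below is an assumption-laden illustration rather than the authors' method: for a bigram model the joint P(w1, w2) is a matrix, and one standard way to maximize likelihood under a rank constraint is an EM fit of the nonnegative factorization P(w1, w2) ≈ Σ_k π_k U[w1, k] W[w2, k] (the PLSA/aspect-model view). All names here (fit_low_rank_bigram, lam, P_out) are hypothetical.

      import numpy as np

      def fit_low_rank_bigram(C, rank, iters=200, seed=0):
          """EM fit of a rank-constrained joint bigram distribution,
          P(w1, w2) ~= sum_k pi[k] * U[w1, k] * W[w2, k].
          C is an n_vocab x n_vocab matrix of bigram counts."""
          rng = np.random.default_rng(seed)
          n_vocab = C.shape[0]
          U = rng.random((n_vocab, rank)); U /= U.sum(axis=0)
          W = rng.random((n_vocab, rank)); W /= W.sum(axis=0)
          pi = np.full(rank, 1.0 / rank)
          for _ in range(iters):
              P = (U * pi) @ W.T                      # current low rank joint
              R = C / np.maximum(P, 1e-12)            # count/model ratio (E-step)
              U_new = U * pi * (R @ W)                # expected counts per factor
              W_new = W * (R.T @ (U * pi))
              pi = U_new.sum(axis=0); pi /= pi.sum()  # M-step renormalization
              U = U_new / np.maximum(U_new.sum(axis=0), 1e-12)
              W = W_new / np.maximum(W_new.sum(axis=0), 1e-12)
          return pi, U, W

      # Hypothetical usage on a tiny 5-word vocabulary with fake counts,
      # plus the domain-adaptation step the abstract mentions: linear
      # interpolation with an out-of-domain model (a uniform stand-in here;
      # lam would be tuned on held-out in-domain data).
      rng = np.random.default_rng(1)
      C = rng.integers(0, 5, size=(5, 5)).astype(float)
      pi, U, W = fit_low_rank_bigram(C, rank=2)
      P_joint = (U * pi) @ W.T
      P_cond = P_joint / P_joint.sum(axis=1, keepdims=True)  # P(w2 | w1)
      P_out = np.full_like(P_cond, 1.0 / P_cond.shape[0])
      lam = 0.7
      P_adapted = lam * P_cond + (1 - lam) * P_out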
  • Keywords
    computational linguistics; smoothing methods; statistical distributions; interpolation; joint probability distribution; language model smoothing technique; low rank language model; low rank tensor representation; lower perplexity; parameter tying; perplexity reduction; rank constraint; standard smoothing techniques; training set; Complexity theory; Data models; Joints; Smoothing methods; Tensile stress; Training; Vocabulary; Language model; low rank tensor
  • fLanguage
    English
  • Journal_Title
    IEEE Signal Processing Letters
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type
    jour
  • DOI
    10.1109/LSP.2011.2160850
  • Filename
    5934582