• DocumentCode
    82016
  • Title
    A Sparse Plus Low-Rank Exponential Language Model for Limited Resource Scenarios
  • Author
    Hutchinson, Brian; Ostendorf, Mari; Fazel, Maryam
  • Author_Institution
    Comput. Sci. Dept., Western Washington Univ., Bellingham, WA, USA
  • Volume
    23
  • Issue
    3
  • fYear
    2015
  • fDate
    March 2015
  • Firstpage
    494
  • Lastpage
    504
  • Abstract
    This paper describes a new exponential language model that decomposes the model parameters into one or more low-rank matrices that learn regularities in the training data and one or more sparse matrices that learn exceptions (e.g., keywords). The low-rank matrices induce continuous-space representations of words and histories. The sparse matrices learn multi-word lexical items and topic/domain idiosyncrasies. This model generalizes the standard ℓ1-regularized exponential language model, and has an efficient accelerated first-order training algorithm. Language modeling experiments show that the approach is useful in scenarios with limited training data, including low resource languages and domain adaptation.
  • Keywords
    computational linguistics; matrix algebra; natural language processing; optimisation; ℓ1-regularized exponential language model; limited resource scenario; low-rank matrix; model parameter decomposition; rank-penalized optimization; sparse matrix; Adaptation models; Data models; History; Matrix decomposition; Sparse matrices; Standards; Training; Language model; exponential; log bilinear; low-rank; sparse
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2014.2379593
  • Filename
    7050385