• DocumentCode
    394234
  • Title
    Unsupervised language model adaptation
  • Author
    Bacchiani, Michiel; Roark, Brian

  • Author_Institution
    AT&T Labs-Research, USA
  • Volume
    1
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    This paper investigates unsupervised language model adaptation from ASR transcripts. N-gram counts from these transcripts can be used either to adapt an existing n-gram model or to build an n-gram model from scratch. Various experimental results are reported on a particular domain adaptation task, namely building a customer care application starting from a general voicemail transcription system. The experiments investigate the effectiveness of various adaptation strategies, including iterative adaptation and self-adaptation on the test data. They show an error rate reduction of 3.9% over the unadapted baseline performance (from 28% to 24.1%) using 17 hours of unsupervised adaptation material; this is 51% of the 7.7% gain obtained by supervised adaptation. Self-adaptation on the test data yielded a 1.3% improvement over the baseline.
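    The count-based adaptation the abstract describes can be sketched as mixing n-gram statistics from a background corpus with statistics estimated from (possibly errorful) adaptation transcripts. This is a minimal illustrative sketch, not the paper's actual method: the function names and the fixed interpolation weight are assumptions, and real systems would add smoothing and backoff.

    ```python
    from collections import Counter

    def ngram_counts(tokens, n):
        """Count the n-grams (as tuples) occurring in a token sequence."""
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    def adapt_model(base_counts, adapt_counts, weight=0.5):
        """Linearly interpolate two n-gram count tables into one distribution.

        Each table is normalized to relative frequencies, then mixed:
            p(g) = (1 - weight) * p_base(g) + weight * p_adapt(g)
        `weight` is an illustrative fixed mixing weight (an assumption);
        in practice it would be tuned on held-out data.
        """
        base_total = sum(base_counts.values())
        adapt_total = sum(adapt_counts.values())
        grams = set(base_counts) | set(adapt_counts)
        return {g: (1 - weight) * base_counts[g] / base_total
                   + weight * adapt_counts[g] / adapt_total
                for g in grams}

    # Background counts vs. counts from (hypothetical) ASR transcripts
    base = ngram_counts("please leave a message after the tone".split(), 2)
    adapt = ngram_counts("thank you for calling customer care".split(), 2)
    mixed = adapt_model(base, adapt, weight=0.5)
    ```

    Building a model "from scratch" from the transcripts corresponds to `weight=1.0`; iterative adaptation would re-decode with the mixed model and repeat.
    
    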
  • Keywords
    iterative methods; natural languages; speech recognition; voice mail; ASR transcripts; adaptation strategies; customer care application; domain adaptation task; error rate reduction; iterative adaptation; n-gram counts; n-gram model; self-adaptation; unadapted baseline performance; unsupervised adaptation material; unsupervised language model adaptation; voicemail transcription system; Acoustic testing; Adaptation model; Automatic speech recognition; Automatic testing; Error analysis; Speech recognition; System testing; Training data; Vocabulary; Voice mail;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03)
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1198758
  • Filename
    1198758