• DocumentCode
    2180159
  • Title
    A sampling-based environment population projection approach for rapid acoustic model adaptation
  • Author
    Tsao, Yu ; Matsuda, Shigeki ; Sakai, Shinsuke ; Isotani, Ryosuke ; Kawai, Hisashi ; Nakamura, Satoshi
  • Author_Institution
    Spoken Language Commun. Group, Nat. Inst. of Inf. & Commun. Technol., Kyoto, Japan
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    5504
  • Lastpage
    5507
  • Abstract
    We propose an environment population projection (EPP) approach for rapid acoustic model adaptation that reduces environment mismatches when only a limited amount of adaptation data is available. The approach consists of two stages: population construction and projection. In the population construction stage, we apply a sampling scheme to the adaptation data to construct an environment population based on acoustic models prepared in the training phase. With this sampling procedure, the environment samples in the population characterize the diverse acoustic information embedded in the adaptation data. In the projection stage, we estimate a function that maps the environment population into one set of acoustic models matching the testing condition. With a well-constructed environment population, a simple projection function enables the EPP approach to characterize the testing environment accurately even with a small amount of adaptation data. To examine the rapid adaptation ability of EPP, we used only one adaptation utterance and tested performance in both supervised and unsupervised adaptation modes on the Aurora-2 and Aurora-2J tasks. EPP achieves satisfactory performance in both modes on both tasks. On the Aurora-2J task, for example, EPP lowers the word error rate (WER) from 8.58% to 7.39% in the unsupervised adaptation mode, a 13.87% relative WER reduction over our baseline.
  • Keywords
    sampling methods; speech enhancement; speech recognition; ASR; Aurora-2 task; Aurora-2J task; automatic speech recognition; diverse acoustic information; environment mismatch reduction; population construction stage; population projection stage; rapid acoustic model adaptation; sampling-based EPP approach; sampling-based environment population projection approach; Acoustics; Adaptation models; Hidden Markov models; Signal to noise ratio; Speech; Testing; Training; Stochastic matching; acoustic model adaptation; ensemble classification; environment population projection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947605
  • Filename
    5947605
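
The 13.87% figure quoted in the abstract appears to be a relative reduction, that is, the drop in WER divided by the baseline WER. The short sketch below checks that arithmetic using the two WER values from the abstract; the function name `relative_wer_reduction` is hypothetical and is not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction, in percent, with respect to the baseline.

    Hypothetical helper for illustration only; not part of the paper.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# WER values quoted in the abstract (Aurora-2J, unsupervised adaptation mode).
reduction = relative_wer_reduction(8.58, 7.39)
print(f"{reduction:.2f}%")  # prints 13.87%
```

The absolute reduction is 8.58 - 7.39 = 1.19 percentage points; dividing by the 8.58% baseline gives the relative figure.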