• DocumentCode
    104545
  • Title
    An Unsupervised Adaptation Approach to Leveraging Feedback Loop Data by Using i-Vector for Data Clustering and Selection
  • Author
    Jian Xu ; Zhi-Jie Yan ; Qiang Huo
  • Author_Institution
    Dept. of Autom., Univ. of Sci. & Technol. of China, Hefei, China
  • Volume
    22
  • Issue
    11
  • fYear
    2014
  • fDate
    Nov. 2014
  • Firstpage
    1581
  • Lastpage
    1589
  • Abstract
    We present a study of using unsupervised adaptation approaches to improve speech recognition accuracy of a deployed speech service by leveraging large-scale untranscribed speech data collected from a feedback loop (FBL). For a regular user with many adaptation utterances, conventional CMLLR-based adaptation can be used for personalization directly. For a casual user with only a few adaptation utterances, we propose to use CMLLR-based adaptation by augmenting his or her adaptation utterances with utterances acoustically close to the user, which are selected from the FBL data by an i-vector based approach. For a new user, we propose to perform CMLLR-based recognition of an unknown utterance by selecting a set of CMLLR transforms from the most similar cluster; these transforms are pre-trained using the utterances of the corresponding cluster, which is generated from the FBL data by an i-vector based utterance clustering method. The effectiveness of the above approaches is confirmed by our experiments on a short message dictation task on smart phones.
  • Keywords
    feedback; maximum likelihood estimation; regression analysis; speech recognition; CMLLR-based adaptation; CMLLR-based recognition; adaptation utterances; data clustering; data selection; deployed speech service; feedback loop data; i-vector based utterance clustering method; large-scale untranscribed speech data; short message dictation task; smart phones; speech recognition accuracy; unknown utterance; unsupervised adaptation approach; Acoustics; Data mining; Runtime; Speech; Speech processing; Speech recognition; Transforms; Data augmentation; data clustering; feedback loop; i-vector; personalization; unsupervised adaptation;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2014.2341911
  • Filename
    6861990