DocumentCode
104545
Title
An Unsupervised Adaptation Approach to Leveraging Feedback Loop Data by Using i-Vector for Data Clustering and Selection
Author
Jian Xu ; Zhi-Jie Yan ; Qiang Huo
Author_Institution
Dept. of Autom., Univ. of Sci. & Technol. of China, Hefei, China
Volume
22
Issue
11
fYear
2014
fDate
Nov. 2014
Firstpage
1581
Lastpage
1589
Abstract
We present a study of using unsupervised adaptation approaches to improve speech recognition accuracy of a deployed speech service by leveraging large-scale untranscribed speech data collected from a feedback loop (FBL). For a regular user with lots of adaptation utterances, conventional CMLLR-based adaptation can be used for personalization directly. For a casual user with a few adaptation utterances, we propose to use CMLLR-based adaptation by augmenting his/her adaptation utterances with utterances acoustically close to the user, which are selected from the FBL data by an i-vector based approach. For a new user, we propose to perform a CMLLR-based recognition of an unknown utterance by selecting a set of CMLLR transforms from the most similar cluster, which are pre-trained by using the utterances from the corresponding cluster generated by an i-vector based utterance clustering method from the FBL data. The effectiveness of the above approaches is confirmed by our experiments on a short message dictation task on smart phones.
Keywords
feedback; maximum likelihood estimation; regression analysis; speech recognition; CMLLR-based adaptation; CMLLR-based recognition; adaptation utterances; data clustering; data selection; deployed speech service; feedback loop data; i-vector based utterance clustering method; large-scale untranscribed speech data; short message dictation task; smart phones; speech recognition accuracy; unknown utterance; unsupervised adaptation approach; Acoustics; Data mining; Runtime; Speech; Speech processing; Speech recognition; Transforms; Data augmentation; data clustering; feedback loop; i-vector; personalization; unsupervised adaptation
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
2329-9290
Type
jour
DOI
10.1109/TASLP.2014.2341911
Filename
6861990