Title :
Reliable accent specific unit generation with dynamic Gaussian mixture selection for multi-accent speech recognition
Author :
Zhang, Chao ; Liu, Yi ; Xia, Yunqing ; Zheng, Thomas Fang ; Olsen, Jesper ; Tian, Jilei
Author_Institution :
Center for Speech and Language Technologies, Division of Technology Innovation and Development, Tsinghua National Laboratory for Information Science and Technology, Beijing, China
Abstract :
Multiple accents are often present in Mandarin speech, as most Chinese speakers have learned Mandarin as a second language. We propose generating reliable accent-specific units together with dynamic Gaussian mixture selection for multi-accent speech recognition. Time-alignment phoneme recognition is used to generate these units and to model accent variations explicitly and accurately. The dynamic Gaussian mixture selection scheme builds a dynamic observation density for each specified frame during decoding, which makes efficient use of the Gaussian mixture components. This method increases coverage of the diverse accent variations found in multi-accent speech and alleviates the performance degradation caused by pruned beam search, without increasing the model size. The effectiveness of this approach is evaluated on three typical Chinese accents: Chuan, Yue and Wu. Our approach significantly outperforms the traditional acoustic model reconstruction approach, with Syllable Error Rate (SER) reductions of 6.30%, 4.93% and 5.53%, respectively, without degrading performance on standard speech.
Keywords :
Dynamic Gaussian Mixture Selection Scheme; Multiple Accents; Reliable Accent Specific Unit;
Conference_Title :
Multimedia and Expo (ICME), 2011 IEEE International Conference on
Conference_Location :
Barcelona, Spain
Print_ISBN :
978-1-61284-348-3
Electronic_ISBN :
1945-7871
DOI :
10.1109/ICME.2011.6011941