DocumentCode :
2546633
Title :
Improved acoustic models for Conversational Telephone Speech recognition
Author :
Zhang, Qingqing ; Cai, Shang ; Pan, Jielin ; Yan, Yonghong
Author_Institution :
Key Lab. of Speech Acoust. & Content Understanding, Beijing, China
fYear :
2012
fDate :
29-31 May 2012
Firstpage :
1229
Lastpage :
1232
Abstract :
This paper describes advances in acoustic modeling for the Chinese spontaneous Conversational Telephone Speech (CTS) recognition task. Several approaches to acoustic modeling were investigated, including Heteroscedastic Linear Discriminant Analysis (HLDA), Vocal Tract Length Normalization (VTLN), Gaussianization, Minimum Phone Error (MPE) training, and feature-space MPE (fMPE). To account for pronunciation variation in continuous speech, tones in the recognition vocabulary were modified according to the tone sandhi rule. The acoustic models were trained on over 200 hours of audio data from standard LDC corpora. Compared with the baseline acoustic models, the improved models reduce the Character Error Rate (CER) by about 25% relative on a standard LDC test set and on the China 863 Program evaluation data set.
Keywords :
speech recognition; statistical analysis; telephony; CER; CTS recognition task; China 863 program evaluation data set; Chinese spontaneous conversational telephone speech recognition task; Gaussianization; HLDA; Sandhi rule; VTLN; audio data; baseline acoustic models; character error rate; continuous speech; fMPE; feature space MPE; heteroscedastic linear discriminant analysis; improved acoustic models; minimum phone error; pronunciation variations; recognition vocabulary; standard LDC corpora; standard LDC test set; vocal tract length normalization; Acoustics; Data models; Feature extraction; Hidden Markov models; Speech; Speech recognition; Training; Acoustics modeling; CTS; speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
Conference_Location :
Sichuan
Print_ISBN :
978-1-4673-0025-4
Type :
conf
DOI :
10.1109/FSKD.2012.6234022
Filename :
6234022