DocumentCode :
2800378
Title :
The 2009 IBM GALE Mandarin broadcast transcription system
Author :
Chu, Stephen M. ; Povey, Daniel ; Kuo, Hong-Kwang ; Mangu, Lidia ; Zhang, Shilei ; Shi, Qin ; Qin, Yong
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4374
Lastpage :
4377
Abstract :
This paper gives an up-to-date description of the IBM Mandarin broadcast transcription system developed under the DARPA GALE program. Technical advances over our previous system include a novel acoustic modeling approach using subspace Gaussian mixture models, a speaking rate adaptation method using frame rate normalization, and an effective recipe for lattice combination. We present results on three consortium-defined test sets. It is shown that with these advances, the new system attains a 9% relative reduction in character error rate compared to our previous GALE evaluation system. The reported 9.1% error rate on the phase three evaluation set represents the state of the art in Mandarin broadcast speech transcription.
Keywords :
Gaussian processes; language translation; natural language processing; DARPA GALE program; IBM GALE Mandarin broadcast transcription system; Mandarin broadcast speech transcription; acoustic modeling approach; consortium defined test sets; frame rate normalization; subspace Gaussian mixture models; Automatic speech recognition; Broadcasting; Buildings; Delay; Error analysis; Lattices; Maximum likelihood decoding; Natural languages; Performance gain; Speech recognition; CFRN; UBM; speaking rate adaptation; speech recognition; subspace GMM;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495639
Filename :
5495639