Title :
The 2009 IBM GALE Mandarin broadcast transcription system
Author :
Chu, Stephen M. ; Povey, Daniel ; Kuo, Hong-Kwang ; Mangu, Lidia ; Zhang, Shilei ; Shi, Qin ; Qin, Yong
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
This paper gives an up-to-date description of the IBM Mandarin broadcast transcription system developed under the DARPA GALE program. Technical advances over our previous system include a novel acoustic modeling approach using subspace Gaussian mixture models, a speaking rate adaptation method using frame rate normalization, and an effective recipe for lattice combination. We present results on three consortium-defined test sets. It is shown that with these advances, the new system attains a 9% relative reduction in character error rate compared to our previous GALE evaluation system. The reported 9.1% error rate on the phase three evaluation set represents the state of the art in Mandarin broadcast speech transcription.
Keywords :
Gaussian processes; language translation; natural language processing; DARPA GALE program; IBM GALE Mandarin broadcast transcription system; Mandarin broadcast speech transcription; acoustic modeling approach; consortium defined test sets; frame rate normalization; subspace Gaussian mixture models; Automatic speech recognition; Broadcasting; Buildings; Delay; Error analysis; Lattices; Maximum likelihood decoding; Natural languages; Performance gain; Speech recognition; CFRN; UBM; speaking rate adaptation; speech recognition; subspace GMM
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495639