Building a highly accurate Mandarin speech recognizer

Author

Hwang, Mei-Yuh ; Peng, Gang ; Wang, Wen ; Faria, Arlo ; Heidel, Aaron ; Ostendorf, Mari

Author_Institution

Univ. of Washington, Seattle

fYear

2007

fDate

9-13 Dec. 2007

Firstpage

490

Lastpage

495

Abstract

We describe a highly accurate large-vocabulary continuous Mandarin speech recognizer, a collaborative effort among four research organizations. In particular, we build two acoustic models (AMs) with significant differences but similar accuracy for the purposes of cross adaptation and system combination. This paper elaborates on the main differences between the two systems: one recognizer incorporates a discriminatively trained feature, while the other uses a discriminative feature transformation. Additionally, we present an improved acoustic segmentation algorithm and topic-based language model (LM) adaptation. Coupled with increased acoustic training data, these changes reduced the character error rate (CER) on the DARPA GALE 2006 evaluation set from 18.4% to 15.3%.

Keywords

acoustic signal processing; error statistics; speech recognition; vocabulary; CER; DARPA GALE 2006 evaluation; Mandarin speech recognizer; character error rate; discriminative feature transformation; highly accurate large-vocabulary; improved acoustic segmentation algorithm; topic-based language model; Acoustic testing; Automatic speech recognition; Broadcasting; Computer science; Error analysis; International collaboration; Maximum likelihood decoding; Speech recognition; System testing; Training data; LM adaptation; Mandarin; acoustic segmentation; character error rates; discriminative features; multi-layer perceptrons; out-of-vocabulary;

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on

Conference_Location

Kyoto

Print_ISBN

978-1-4244-1746-9

Electronic_ISBN

978-1-4244-1746-9

Type

conf

DOI

10.1109/ASRU.2007.4430161

Filename

4430161