Regional Information Center for Science and Technology - Selective MCE training strategy in Mandarin speech recognition

DocumentCode :

389252

Title :

Selective MCE training strategy in Mandarin speech recognition

Author :

Zha, Jian-Min ; Zhu, Xin-zhong ; Xu, Hui-Ying

Author_Institution :

Inst. of Comput. Sci. Studies, Zhejiang Normal Univ., China

Volume :

fYear :

2002

fDate :

2002

Firstpage :

679

Abstract :

Minimum classification error (MCE) based discriminative methods have been extensively studied and successfully applied to automatic speech recognition, speaker recognition, and utterance verification. Our goal is to modify the embedded string-model-based MCE algorithm to handle the large number of cross-syllable triphones used in a large vocabulary recognition system. We introduce a selective strategy for MCE-based discriminative training, aimed in particular at Mandarin speech syllable loop recognition. We use a syllable loop recognition task to evaluate the acoustic model of an established large vocabulary continuous speech recognition system. The basic idea is that, since decoding errors occur in only some of the models within a decoded sentence, it is reasonable to adjust the parameters of just those "wrong models". As a result, a weighted MCE formulation is derived, which provides more effective convergence and about a 10% error rate reduction on a large training set. Our experiments show that, although the performance of the recognition system improves, some results that were originally correct are misrecognized after discriminative training, and a divide and conquer strategy is proposed as a solution: the acoustic feature space is divided into two or more sub-spaces according to the discriminative training procedure. By combining these two methods, we obtain over 14.5% error rate reduction in syllable loop recognition.
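
The selective idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's weighted MCE formulation, so the code below is only an assumption-laden illustration: it uses the standard MCE misclassification measure with a sigmoid loss, and a hypothetical 0/1 weight that keeps only the segments decoded incorrectly (the "wrong models"). All function names, the parameters eta and gamma, and the weighting rule are illustrative and are not taken from the paper.

```python
import math

def misclassification_measure(scores, correct, eta=1.0):
    """Standard MCE measure: soft maximum of competitor scores minus the correct score.

    d < 0 means the correct class wins, d > 0 means a competitor wins.
    """
    competitors = [g for j, g in enumerate(scores) if j != correct]
    m = max(competitors)  # subtract the maximum for numerical stability
    mean_exp = sum(math.exp(eta * (g - m)) for g in competitors) / len(competitors)
    return m + math.log(mean_exp) / eta - scores[correct]

def mce_loss(d, gamma=1.0):
    """Sigmoid smoothing of the 0/1 classification error."""
    return 1.0 / (1.0 + math.exp(-gamma * d))

def selective_weights(reference_ids, decoded_ids):
    """Hypothetical selection rule (an assumption): weight 1 for wrongly decoded segments, else 0."""
    return [1.0 if r != h else 0.0 for r, h in zip(reference_ids, decoded_ids)]

def selective_mce_loss(segment_scores, reference_ids, decoded_ids, eta=1.0, gamma=1.0):
    """Weighted sum of per-segment MCE losses; correctly decoded segments contribute nothing."""
    weights = selective_weights(reference_ids, decoded_ids)
    total = 0.0
    for w, scores, ref in zip(weights, segment_scores, reference_ids):
        if w > 0.0:
            total += w * mce_loss(misclassification_measure(scores, ref, eta), gamma)
    return total
```

With such a weighting, only the models attached to misrecognized segments would receive gradient updates. The weighting actually derived in the paper may well be graded rather than binary.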

Keywords :

convergence; divide and conquer methods; error statistics; signal classification; speech coding; speech recognition; Mandarin speech recognition; Mandarin speech syllable loop recognition; acoustic feature space; acoustic model; cross-syllable triphones; decoding errors; discriminative training; divide and conquer strategy; embedded string model; error probability; error rate reduction; large vocabulary continuous speech recognition system; large vocabulary recognition system; minimum classification error based discriminative methods; selective MCE training strategy; selective strategy; Automatic speech recognition; Computer errors; Computer science; Educational institutions; Error analysis; Maximum likelihood decoding; Maximum likelihood estimation; Speech recognition; Viterbi algorithm; Vocabulary

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on

Print_ISBN :

0-7803-7508-4

Type :

conf

DOI :

10.1109/ICMLC.2002.1174423

Filename :

1174423

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=389252