Title :
Discriminative HMM stream model for Mandarin digit string speech recognition
Author :
Shi, Yuan-yuan ; Liu, Jia ; Liu, Run-sheng
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
Abstract :
A conventional hidden Markov model (HMM) based only on spectral features does not achieve high recognition performance on connected Mandarin digits, because highly confusable syllables exist. The main problems of Mandarin digit recognition are analyzed. The analysis reveals that precise classification models for Mandarin digits require not only features extracted from the spectrum, energy, and pitch contour, but also different emphases on these features for different digits. Each type of feature is therefore used to train a single-stream HMM by maximum likelihood. A multi-stream HMM is then obtained by combining the single-stream HMMs with exponents that weight the log-likelihood of each stream. The exponents are estimated with the generalized probabilistic descent algorithm under the digit minimum classification error rate criterion. The superiority of the multi-stream HMM is demonstrated: the string error rate is reduced by 54.5% in relative terms, and for digit strings of unknown length the string error rate and the digit error rate decrease to 4.66% and 1.31%, respectively.
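The stream combination described in the abstract can be illustrated with a minimal sketch. It assumes that each digit model already yields one log-likelihood per stream (spectrum, energy, pitch) and that the combined score is the exponent-weighted sum of those log-likelihoods. The function names, the digit labels, and all numbers below are hypothetical and are not taken from the paper; the exponents are set by hand here, whereas the paper estimates them by generalized probabilistic descent.

```python
def combine_streams(stream_loglik, exponents):
    """Return the exponent-weighted sum of per-stream log-likelihoods.

    Raising each stream likelihood to an exponent and multiplying is the
    same as weighting each log-likelihood and summing.
    """
    return sum(w * ll for w, ll in zip(exponents, stream_loglik))


def classify(loglik_by_digit, exponents_by_digit):
    """Pick the digit whose combined multi-stream score is highest."""
    return max(
        loglik_by_digit,
        key=lambda d: combine_streams(loglik_by_digit[d], exponents_by_digit[d]),
    )


# Hypothetical per-stream log-likelihoods: [spectrum, energy, pitch].
loglik = {
    "yi": [-100.0, -20.0, -30.0],
    "qi": [-98.0, -22.0, -35.0],
}

# Equal emphasis on every stream for every digit.
uniform = {"yi": [1.0, 1.0, 1.0], "qi": [1.0, 1.0, 1.0]}

# Hypothetical digit-specific exponents (hand-picked, not trained).
digit_specific = {"yi": [1.0, 1.0, 1.0], "qi": [1.0, 0.5, 0.5]}

decision_uniform = classify(loglik, uniform)
decision_weighted = classify(loglik, digit_specific)
```

With these made-up numbers the decision changes once the energy and pitch streams of one model are de-emphasized, which is the kind of digit-dependent emphasis the abstract argues for.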
Keywords :
error statistics; hidden Markov models; maximum likelihood estimation; pattern classification; speech recognition; Mandarin digit string speech recognition; classification; confusable syllables; digit minimum classification error rate criteria; discriminative HMM stream model; generalized probabilistic descent algorithm; hidden Markov model; log-likelihood; maximum likelihood; multi-stream HMM; pitch contour; recognition performance; relative string error rate; spectral features; Error analysis; Feature extraction; Hidden Markov models; Humans; Maximum likelihood estimation; Speech recognition; Telephony; Viterbi algorithm;
Conference_Title :
Signal Processing, 2002 6th International Conference on
Print_ISBN :
0-7803-7488-6
DOI :
10.1109/ICOSP.2002.1181109