Title :
Contextual vector quantization for speech recognition with discrete hidden Markov model
Author :
Huo, Qiang ; Chan, Chorkin
Author_Institution :
Dept. of Comput. Sci., Hong Kong Univ., Hong Kong
Abstract :
Using the formulation of finite mixture distribution identification, several alternatives to the conventional LBG VQ method are investigated. A contextual VQ method based on Markov random field (MRF) theory is proposed to model the speech feature vector space. Its superiority is confirmed by a series of comparative experiments on a speaker-independent isolated word recognition task, using different VQ schemes as the front end of a discrete HMM (DHMM). The VQ schemes studied include LBG VQ, the classification maximum likelihood (CML) approach, the mixture maximum likelihood (MML) procedure, the ergodic large HMM (LHMM), and the contextual VQ (CVQ) method. The motivation for using the MRF to model the contextual dependence information in the underlying speech production process can be readily extended to acoustic modeling of the basic speech units in speech recognition.
Keywords :
acoustic signal processing; hidden Markov models; maximum likelihood estimation; random processes; speech coding; speech recognition; vector quantisation; LBG VQ method; Markov random field theory; VQ codebook; acoustic modeling; classification maximum likelihood; contextual dependence information; contextual vector quantization; discrete hidden Markov model; ergodic large HMM; experiments; finite mixture distribution identification; mixture maximum likelihood; speaker independent isolated word recognition; speech feature vector space; speech production process; speech recognition; speech units; Acoustic distortion; Clustering algorithms; Computer science; Context modeling; Distortion measurement; Hidden Markov models; Markov random fields; Speech processing; Speech recognition; Vector quantization;
Conference_Title :
Speech, Image Processing and Neural Networks, 1994. Proceedings, ISSIPNN '94., 1994 International Symposium on
Print_ISBN :
0-7803-1865-X
DOI :
10.1109/SIPNN.1994.344816