Title :
Frame-correlated hidden Markov model based on extended logarithmic pool
Author :
Kim, Nam Soo ; Un, Chong Kwan
Author_Institution :
Human & Comput. Interaction Lab., Samsung Adv. Inst. of Technol., South Korea
Date :
3/1/1997
Abstract :
We present a novel method for incorporating temporal correlations into a speech recognition system based on conventional hidden Markov models (HMMs). Temporal correlations are useful for recognition because the speech features of the present frame are highly informative about the feature characteristics of neighboring frames. In this paper, we treat these correlations in the form of conditional probability distributions (PDs) and propose a new technique for incorporating frame correlations. With the proposed method, called the extended logarithmic pool (ELP), we approximate a joint conditional PD by separate conditional PDs associated with the respective conditions. We provide a constrained optimization algorithm for finding the optimal values of the pooling weights. For practical purposes, we also suggest methods for obtaining robust PD estimates that characterize frame correlation. In addition, to improve model discriminability, we introduce a technique that combines two kinds of PDs through their exponents. In speaker-independent continuous speech recognition experiments, the proposed approaches reduce errors by up to 20.5% compared with the conventional bigram-constrained (BC) HMM method.
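The pooling idea in the abstract can be illustrated with the classical logarithmic pool, in which separate conditional PDs are combined as a normalized weighted geometric mean. The sketch below is only that classical form over discrete distributions, not the paper's extended variant: the function name `log_pool`, the example distributions, and the hand-picked weights are all assumptions made for illustration, and the weights here are not obtained by the constrained optimization the abstract mentions.

```python
import math


def log_pool(dists, weights):
    """Combine discrete PDs over the same symbols by a weighted geometric mean.

    Hypothetical helper: assumes every probability is strictly positive and
    that each distribution lists the same symbols in the same order.
    """
    if len(dists) != len(weights):
        raise ValueError("need one weight per distribution")
    # Weighted sum of log-probabilities for each symbol.
    log_scores = [
        sum(w * math.log(p[k]) for p, w in zip(dists, weights))
        for k in range(len(dists[0]))
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(log_scores)
    unnorm = [math.exp(s - shift) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Illustrative values only: PDs of the current frame's symbol conditioned
# on the previous frame and on the next frame, respectively.
p_given_prev = [0.7, 0.2, 0.1]
p_given_next = [0.5, 0.4, 0.1]
pooled = log_pool([p_given_prev, p_given_next], [0.6, 0.4])
```

With all weight placed on one condition, the pool reduces to that condition's PD, which is a quick sanity check on the normalization.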
Keywords :
correlation methods; hidden Markov models; optimisation; probability; speech recognition; conditional probability distributions; constrained optimization algorithm; error reduction; extended logarithmic pool; feature characteristics; frame-correlated hidden Markov model; pooling weights; speaker-independent continuous speech recognition; speech features; speech recognition system; temporal correlations; Character recognition; Constraint optimization; Data mining; Feature extraction; Hidden Markov models; Humans; Laboratories; Probability distribution; Robustness; Speech recognition
Journal_Title :
Speech and Audio Processing, IEEE Transactions on