Frame-correlated hidden Markov model based on extended logarithmic pool

Author

Kim, Nam Soo ; Un, Chong Kwan

Author_Institution

Human & Comput. Interaction Lab., Samsung Adv. Inst. of Technol., South Korea

Volume

5

Issue

2

fYear

1997

fDate

3/1/1997

Firstpage

149

Lastpage

160

Abstract

We present a novel method to incorporate temporal correlations into a speech recognition system based on conventional hidden Markov models (HMMs). Temporal correlations are useful for recognition because the speech features of the present frame are highly informative about the feature characteristics of neighboring frames. Treating these correlations as conditional probability distributions (PDs), we propose a new technique for incorporating frame correlations. With the proposed method, called the extended logarithmic pool (ELP), we approximate a joint conditional PD by separate conditional PDs, each associated with its respective condition. We provide a constrained optimization algorithm that finds the optimal values of the pooling weights. For practical purposes, we also suggest methods for obtaining robust PD estimates that characterize frame correlation. In addition, to improve model discriminability, we introduce a technique that combines two kinds of PDs through their exponents. In speaker-independent continuous speech recognition experiments, the proposed approaches reduce errors by up to 20.5% compared with the conventional bigram-constrained (BC) HMM method.
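
As an illustration of the pooling idea mentioned in the abstract, the sketch below implements a plain logarithmic pool: several distributions over the same outcomes are raised to per-source weights, multiplied, and renormalized. It is not the paper's ELP, whose exact form is not given in this record. The function name `log_pool`, the use of discrete distributions, and the fixed weights are all assumptions made for the example; the paper obtains its weights through constrained optimization, which is not shown here.

```python
import math


def log_pool(dists, weights):
    """Pool discrete distributions: p(k) proportional to prod_i p_i(k) ** w_i.

    Hypothetical helper for illustration only. Assumes every distribution
    is defined over the same outcomes and has strictly positive entries.
    """
    if not dists or len(dists) != len(weights):
        raise ValueError("need at least one distribution and one weight per distribution")
    n = len(dists[0])
    if any(len(p) != n for p in dists):
        raise ValueError("all distributions must have the same length")
    if any(x <= 0.0 for p in dists for x in p):
        raise ValueError("this sketch assumes strictly positive probabilities")

    # Weighted sum of log-probabilities for each outcome k.
    log_scores = [
        sum(w * math.log(p[k]) for p, w in zip(dists, weights))
        for k in range(n)
    ]

    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_scores)
    unnorm = [math.exp(s - m) for s in log_scores]
    z = sum(unnorm)
    return [u / z for u in unnorm]


if __name__ == "__main__":
    # Two toy distributions over two outcomes, pooled with equal weights.
    print(log_pool([[0.5, 0.5], [0.8, 0.2]], [1.0, 1.0]))
```

A weight of zero removes a source from the pool, and a larger weight makes the pooled distribution follow that source more closely, which is why the choice of pooling weights matters.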

Keywords

correlation methods; hidden Markov models; optimisation; probability; speech recognition; conditional probability distributions; constrained optimization algorithm; error reduction; extended logarithmic pool; feature characteristics; frame-correlated hidden Markov model; pooling weights; speaker-independent continuous speech recognition; speech features; speech recognition system; temporal correlations; Character recognition; Constraint optimization; Data mining; Feature extraction; Hidden Markov models; Humans; Laboratories; Probability distribution; Robustness; Speech recognition;

fLanguage

English

Journal_Title

Speech and Audio Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1063-6676

Type

jour

DOI

10.1109/89.554777

Filename

554777