Large margin HMMs for speech recognition

Author

Li, Xinwei ; Jiang, Hui ; Liu, Chaojun

Author_Institution

Dept. of Comput. Sci. & Eng., York Univ., Toronto, Ont., Canada

Volume

5

fYear

2005

fDate

18-23 March 2005

Abstract

Motivated by large margin classifiers in machine learning, we propose a novel method for estimating continuous density hidden Markov models (CDHMMs) in speech recognition according to the principle of maximizing the minimum multi-class separation margin. The approach is named large margin HMM. First, we show that this type of large margin HMM estimation problem can be formulated as a standard constrained minimax optimization problem. Second, we propose an iterative localized optimization approach that performs the minimax optimization for one model at a time, which guarantees that the optimal value of the objective function always exists in the course of model parameter optimization. Then, we show that at each step the optimization can be solved by the generalized probabilistic descent (GPD) algorithm if the objective function is approximated by a differentiable function, such as a summation of exponential functions. The large margin HMM-based classifiers are evaluated on a speaker-independent E-set speech recognition task using the OGI ISOLET database. Experimental results show that large margin HMMs achieve a significant word error rate (WER) reduction over conventional HMM training methods, such as maximum likelihood estimation (MLE) and minimum classification error (MCE) training.
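
The abstract mentions replacing the non-differentiable minimax objective with a differentiable one built from a summation of exponential functions. The sketch below illustrates one common smoothing of that kind, the log-sum-exp approximation of a maximum; it is an assumption made here for illustration and not necessarily the exact objective used in the paper. The names `smooth_max`, `smooth_max_grad`, `smooth_min_margin`, the smoothing constant `eta`, and the example margin values are all hypothetical.

```python
import math

def smooth_max(values, eta=10.0):
    """Differentiable approximation of max(values) using a sum of exponentials.

    Assumed form: (1/eta) * log(sum_i exp(eta * v_i)). Larger eta gives a
    tighter approximation; the error is at most log(len(values)) / eta.
    """
    m = max(values)  # shift by the max for numerical stability
    return m + math.log(sum(math.exp(eta * (v - m)) for v in values)) / eta

def smooth_max_grad(values, eta=10.0):
    """Gradient of smooth_max with respect to each value (softmax weights)."""
    m = max(values)
    weights = [math.exp(eta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_min_margin(margins, eta=10.0):
    """Smooth lower bound on min(margins), using min(d) = -max(-d)."""
    return -smooth_max([-d for d in margins], eta)

# Hypothetical separation margins for three training tokens.
margins = [2.0, 0.5, 1.2]
approx = smooth_min_margin(margins, eta=20.0)
grad = smooth_max_grad([-d for d in margins], eta=20.0)
```

Because the smoothed objective is differentiable, a gradient-based update such as GPD can be applied to it; the gradient weights concentrate on the tokens with the smallest margins, which are the ones a minimum-margin criterion cares about.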

Keywords

approximation theory; error statistics; functions; hidden Markov models; iterative methods; learning (artificial intelligence); minimax techniques; parameter estimation; speech recognition; HMM training methods; constrained minimax optimization; continuous density hidden Markov model; differentiable function; exponential functions; generalized probabilistic descent algorithm; iterative localized optimization approach; large margin HMM estimation; large margin classifiers; maximum likelihood estimation; minimum classification error; minimum multi-class separation margin maximization; model parameter optimization; objective function; speech recognition; word error rate; Boosting; Databases; Error analysis; Hidden Markov models; Machine learning; Maximum likelihood estimation; Minimax techniques; Pattern classification; Speech recognition; Support vector machines;

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-8874-7

Type

conf

DOI

10.1109/ICASSP.2005.1416353

Filename

1416353