A fused hidden Markov model with application to bimodal speech processing

Author

Pan, Hao ; Levinson, Stephen E. ; Huang, Thomas S. ; Liang, Zhi-Pei

Author_Institution

Sharp Labs. of America Inc., Camas, WA, USA

Volume

52

Issue

3

fYear

2004

fDate

3/1/2004 12:00:00 AM

Firstpage

573

Lastpage

581

Abstract

This paper presents a novel fused hidden Markov model (fused HMM) for integrating tightly coupled time series, such as audio and visual features of speech. In this model, the time series are first modeled by two conventional HMMs separately. The resulting HMMs are then fused together using a probabilistic fusion model, which is optimal according to the maximum entropy principle and a maximum mutual information criterion. Simulations and bimodal speaker verification experiments show that the proposed model can significantly reduce the recognition errors in noiseless or noisy environments.

Keywords

hidden Markov models; maximum entropy methods; speaker recognition; speech processing; HMM; bimodal speaker verification; bimodal speech processing; coupled time series; fused hidden Markov model; information fusion; maximum entropy principle; maximum mutual information criterion; noisy environment; probabilistic fusion model; recognition errors reduction; speech audio features; speech visual features; Computer errors; Entropy; Hidden Markov models; Joining processes; Mutual information; Noise reduction; Signal processing; Signal processing algorithms; Speech processing; Working environment noise

fLanguage

English

Journal_Title

Signal Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1053-587X

Type

jour

DOI

10.1109/TSP.2003.822353

Filename

1268351