Restructuring Gaussian mixture density functions in speaker-independent acoustic models

Author

Nakamura, Atsushi

Author_Institution

ATR Interpreting Telephony Res. Labs., Kyoto, Japan

Volume

2

fYear

1998

fDate

12-15 May 1998

Firstpage

649

Abstract

In continuous speech recognition based on hidden Markov models (HMMs), word N-grams, and time-synchronous beam search, a local modeling mismatch in the HMM often degrades recognition performance. To cope with this problem, this paper proposes a method of restructuring the Gaussian mixture PDFs in a pre-trained speaker-independent HMM based on speech data. In this method, mixture components are copied and shared among multiple mixture PDFs, taking the tendency of local errors into account. This tendency is obtained by comparing the pre-trained HMM with the speech data that was used in the pre-training. Experimental results show that the proposed method can effectively correct local modeling mismatches and improve recognition performance.

Keywords

Gaussian processes; acoustic signal processing; grammars; hidden Markov models; probability; search problems; speech recognition; Gaussian mixture PDF restructuring; Gaussian mixture density functions; continuous speech recognition; experimental results; hidden Markov model; local errors; local modeling mismatch; pre-trained speaker-independent HMM; recognition performance; speaker-independent acoustic models; speech data; time-synchronous beam search; word N-gram; Acoustic beams; Deformable models; Degradation; Hidden Markov models; Loudspeakers; Speech recognition; Viterbi algorithm;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.675348

Filename

675348