Automatic speech recognition via pseudo-independent marginal mixtures

Author

Nadas, Andras; Nahamoo, David

Author_Institution

IBM T. J. Watson Research Center, Yorktown Heights, NY

Volume

12

fYear

1987

fDate

April 1987

Firstpage

1285

Lastpage

1287

Abstract

Statistical models (prototypes) of the multivariate probability distribution of vectors (frames) of speech parameters can be used in several ways. If the stream of vectors is passed directly to the decoder of a continuous-parameter speech recognizer, the decoder uses the prototypes itself; if the recognizer has a time-synchronous labeling acoustic processor, the prototypes are used for vector quantization (labeling) and the resulting label stream is passed to the decoder. Other uses are possible as well. We present a method for constructing such prototypes. The method is a compromise between describing a prototype in an assumption-free way, as a nonparametric density, and describing it in a convenient way, as a simple multivariate Gaussian density. We describe speech recognition experiments in which our prototypes were trained by iteratively interleaving steps of a K-means-type clustering algorithm with steps of an EM algorithm for reestimation. We present results, obtained with a labeling acoustic processor, that show significantly fewer decoding errors than our previous methods.
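
The interleaved training loop mentioned in the abstract can be illustrated with a minimal sketch. This record does not give the form of the paper's prototypes, so the sketch substitutes plain diagonal-covariance Gaussians, which is an assumption and not the authors' model. All names (`train`, `kmeans_step`, `em_step`, `label`, `var_floor`) are hypothetical. Each round runs one hard-assignment clustering step followed by one EM reestimation step; `label` then performs vector quantization by picking the best-scoring prototype for a frame.

```python
import math
import random

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def label(x, protos):
    """Vector quantization: index of the best-scoring prototype for frame x."""
    scores = [math.log(w) + log_gauss(x, m, v) for w, m, v in protos]
    return max(range(len(protos)), key=scores.__getitem__)

def kmeans_step(frames, protos):
    """K-means-type step: hard-assign frames, move each mean to its centroid."""
    dim = len(frames[0])
    sums = [[0.0] * dim for _ in protos]
    counts = [0] * len(protos)
    for x in frames:
        k = label(x, protos)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += x[d]
    # An empty cluster keeps its previous mean.
    return [(w, [s / counts[k] for s in sums[k]] if counts[k] else m, v)
            for k, (w, m, v) in enumerate(protos)]

def em_step(frames, protos, var_floor=1e-3):
    """EM step: soft posteriors, then reestimate weights, means, variances."""
    dim, n_protos = len(frames[0]), len(protos)
    occ = [0.0] * n_protos
    s1 = [[0.0] * dim for _ in range(n_protos)]
    s2 = [[0.0] * dim for _ in range(n_protos)]
    for x in frames:
        logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in protos]
        top = max(logs)  # subtract the maximum for numerical stability
        post = [math.exp(l - top) for l in logs]
        z = sum(post)
        for k in range(n_protos):
            g = post[k] / z
            occ[k] += g
            for d in range(dim):
                s1[k][d] += g * x[d]
                s2[k][d] += g * x[d] ** 2
    new = []
    for k, (w, m, v) in enumerate(protos):
        if occ[k] < 1e-8:
            new.append((w, m, v))  # starved prototype: leave unchanged
            continue
        mean = [s / occ[k] for s in s1[k]]
        var = [max(s / occ[k] - mu ** 2, var_floor)
               for s, mu in zip(s2[k], mean)]
        new.append((occ[k] / len(frames), mean, var))
    return new

def train(frames, k, rounds=5, init=None, seed=0):
    """Interleave clustering and EM steps; init is an optional list of means."""
    means = init if init is not None else random.Random(seed).sample(frames, k)
    protos = [(1.0 / k, list(m), [1.0] * len(m)) for m in means]
    for _ in range(rounds):
        protos = kmeans_step(frames, protos)
        protos = em_step(frames, protos)
    return protos

# Toy usage: two well-separated groups of 2-D "frames".
frames = [(0.0, 0.0), (0.5, -0.5), (-0.5, 0.5), (0.2, 0.3),
          (10.0, 10.0), (10.5, 9.5), (9.5, 10.5), (10.2, 10.3)]
protos = train(frames, k=2, init=[frames[0], frames[4]])
labels = [label(x, protos) for x in frames]
```

In a labeling acoustic processor of the kind the abstract describes, the sequence of `label` outputs would form the label stream handed to the decoder. The variance floor and the handling of starved prototypes are design choices of this sketch only.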

Keywords

Automatic speech recognition; Clustering algorithms; Decoding; Iterative algorithms; Labeling; Probability distribution; Prototypes; Speech processing; Speech recognition; Vector quantization

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '87.

Type

conf

DOI

10.1109/ICASSP.1987.1169454

Filename

1169454