DocumentCode :
1253775
Title :
Maximum likelihood multiple subspace projections for hidden Markov models
Author :
Gales, Mark J F
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Volume :
10
Issue :
2
fYear :
2002
fDate :
2/1/2002
Firstpage :
37
Lastpage :
47
Abstract :
The first stage in many pattern recognition tasks is to generate a good set of features from the observed data. Usually, only a single feature space is used. However, in some complex pattern recognition tasks the choice of a good feature space may vary depending on the signal content. An example is speech recognition, where phone-dependent feature subspaces may be useful. Handling multiple subspaces while still maintaining meaningful likelihood comparisons between classes is a key issue. This paper describes two new forms of multiple subspace schemes. For both schemes, the problem of handling likelihood consistency between the various subspaces is dealt with by viewing the projection schemes within a maximum likelihood framework. Efficient estimation formulae for the model parameters are derived for both schemes. In addition, the computational costs of their use during recognition are given. These new projection schemes are evaluated on a large vocabulary speech recognition task in terms of performance, speed of likelihood calculation, and number of parameters.
Keywords :
hidden Markov models; maximum likelihood estimation; speech recognition; computational cost; estimation formulae; feature space; likelihood consistency; maximum likelihood framework; maximum likelihood multiple subspace projections; multiple subspaces; pattern recognition tasks; phone dependent feature subspaces; projection schemes; signal content; Computational efficiency; Helium; Linear discriminant analysis; Parameter estimation; Pattern recognition; Principal component analysis; Vocabulary
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.985541
Filename :
985541