DocumentCode :
394304
Title :
Optimal clustering of multivariate normal distributions using divergence and its application to HMM adaptation
Author :
Myrvoll, Tor Andre ; Soong, Frank K.
Author_Institution :
Lucent Technol. Bell Labs., Murray Hill, NJ, USA
Volume :
1
fYear :
2003
fDate :
6-10 April 2003
Abstract :
We present an optimal clustering algorithm for grouping multivariate normal distributions into clusters using the divergence, a symmetric, information-theoretic distortion measure based on the Kullback-Leibler distance. Optimal solutions for normal distributions are shown to be obtained by solving a set of Riccati matrix equations, and the optimal centroids are found by alternating between the intermediate solutions for the mean and the covariance matrix. In clustering experiments, the new algorithm compared favorably with the conventional, non-optimal centroids (sample mean and sample covariance), both in overall rate-distortion performance and in how evenly samples were distributed across clusters. The resulting clusters were further tested on unsupervised adaptation of HMM parameters in a structured maximum a posteriori linear regression (SMAPLR) framework, using the Wall Street Journal database for the adaptation experiment. The word error rate improved significantly, from 32.6% with the non-optimal centroid (sample mean and covariance) to 27.6% and 27.5% for the diagonal and full covariance matrix cases, respectively.
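The alternating centroid computation described in the abstract can be illustrated for the diagonal-covariance case, where each dimension reduces to a scalar problem. The following is a minimal sketch, not the authors' implementation: it assumes the divergence is the sum of the two directed Kullback-Leibler distances, and the function names `divergence_diag` and `centroid_diag` are hypothetical. The update formulas are obtained by setting the derivative of the summed divergence to zero; the variance update is the scalar counterpart of the Riccati matrix equation mentioned in the abstract.

```python
def divergence_diag(m1, v1, m2, v2):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), between two Gaussians
    with diagonal covariances (v1, v2 are per-dimension variances).
    Assumption: this is the divergence meant in the abstract."""
    total = 0.0
    for a, s, b, t in zip(m1, v1, m2, v2):
        total += 0.5 * (s / t + t / s - 2.0)                # covariance term
        total += 0.5 * (a - b) ** 2 * (1.0 / s + 1.0 / t)   # mean term
    return total


def centroid_diag(means, variances, iters=50):
    """Alternate mean and variance updates to minimize the summed divergence
    from the cluster members to the centroid (diagonal case only)."""
    n, d = len(means), len(means[0])
    # Start from the conventional centroid: sample mean and average variance.
    mu = [sum(m[k] for m in means) / n for k in range(d)]
    var = [sum(v[k] for v in variances) / n for k in range(d)]
    for _ in range(iters):
        for k in range(d):
            # Mean update for fixed variance: precision-weighted average.
            w = [1.0 / v[k] + 1.0 / var[k] for v in variances]
            mu[k] = sum(wi * m[k] for wi, m in zip(w, means)) / sum(w)
            # Variance update for fixed mean: solve var^2 = num / den.
            num = sum(v[k] + (m[k] - mu[k]) ** 2
                      for m, v in zip(means, variances))
            den = sum(1.0 / v[k] for v in variances)
            var[k] = (num / den) ** 0.5
    return mu, var


def total_divergence(means, variances, mu, var):
    """Summed divergence from every cluster member to a candidate centroid."""
    return sum(divergence_diag(m, v, mu, var)
               for m, v in zip(means, variances))
```

Because each update exactly minimizes the objective in one variable while the other is held fixed, the summed divergence cannot increase from one iteration to the next, so the result is never worse than the sample-mean starting point under this measure. The full-covariance case requires matrix operations and is not covered by this sketch.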
Keywords :
hidden Markov models; maximum likelihood estimation; normal distribution; pattern clustering; rate distortion theory; speech recognition; HMM parameters; Kullback-Leibler distance; Riccati matrix equations; SMAPLR; clusters; covariance matrix; divergence; intermediate solutions; multivariate normal distributions; normal distributions; optimal centroids; optimal clustering algorithm; rate-distortion; recognition performance; structured maximum a posteriori linear regression; symmetric information-theoretic distortion measure; unsupervised adaptation; word error rate; Clustering algorithms; Covariance matrix; Distortion measurement; Gaussian distribution; Hidden Markov models; Linear regression; Rate-distortion; Riccati equations; Symmetric matrices; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1198840
Filename :
1198840