Title :
Application of variational Bayesian PCA for speech feature extraction
Author :
Kwon, Oh-Wook ; Lee, Te-Won ; Chan, Kwokleung
Author_Institution :
Institute for Neural Computation, University of California, San Diego, 8500 Gilman Drive, La Jolla, 92059-0523, USA
Abstract :
In a standard mel-frequency cepstral coefficient-based speech recognizer, it is common to use the same feature dimension and the same number of Gaussian mixtures for all subunits. We proposed to use a different transformation and a different number of mixtures for each subunit. We obtained the transformations from mel-frequency band energies by using the variational Bayesian principal component analysis (PCA) method. In this method, the hyperparameters of the Gaussian mixtures and the number of mixtures are learned automatically by maximizing a lower bound of the evidence, rather than the likelihood as in the conventional maximum likelihood paradigm. Analyzing the TIMIT speech data, we revealed intrinsic structures of vowels and consonants. We demonstrated the usefulness of the method for speech recognition by performing phoneme classification of the /b/, /d/ and /g/ phonemes.
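The "lower bound of the evidence" mentioned in the abstract can be written in its generic variational form. This is a sketch under assumed notation, not the paper's own derivation: X (the observed features), θ (all latent variables and parameters) and q (the approximating distribution) are symbols introduced here for illustration and do not appear in the abstract.

```latex
% Generic variational decomposition of the log evidence (assumed notation).
\begin{align}
  \ln p(X) &= \mathcal{L}(q)
              + \mathrm{KL}\bigl(q(\theta)\,\|\,p(\theta \mid X)\bigr)
              \;\ge\; \mathcal{L}(q), \\
  \mathcal{L}(q) &= \int q(\theta)\,
              \ln \frac{p(X, \theta)}{q(\theta)}\, d\theta .
\end{align}
```

Because the KL term is non-negative, maximizing the bound over q tightens it toward the log evidence. Unlike the maximized likelihood, the evidence integrates over the parameters and so penalizes unnecessary model complexity, which is what allows quantities such as the number of mixtures to be selected automatically.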
Keywords :
Bayesian methods; Feature extraction; Hidden Markov models; Principal component analysis; Speech; Speech recognition; Transforms;
Conference_Title :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.2002.5743866