Title :
Dimensionality reduction methods for HMM phonetic recognition
Author :
Hu, Hongbing ; Zahorian, Stephen A.
Author_Institution :
Dept. of Electr. & Comput. Eng., Binghamton Univ., Binghamton, NY, USA
Abstract :
This paper presents two nonlinear feature dimensionality reduction methods based on neural networks for an HMM-based phone recognition system. The neural networks are trained as feature classifiers to reduce feature dimensionality while maximizing discrimination among speech features. The outputs of different network layers are used to obtain the transformed features. Moreover, the networks are trained with category information that corresponds to individual HMM states, so that the trained networks can better accommodate the temporal variability of features and yield more discriminative features in a low-dimensional space. Experimental evaluation using the TIMIT database shows that recognition accuracies with the transformed features are slightly higher than those obtained with the original features and considerably higher than those obtained with linear dimensionality reduction methods. The highest phone accuracy obtained on TIMIT with 39 phone classes was 74.9%, using a large number of training iterations based on the state-specific targets.
Keywords :
feature extraction; hidden Markov models; neural nets; pattern classification; speech processing; speech recognition; HMM phonetic recognition; HMM-based phone recognition system; TIMIT database; feature classifier; linear dimensionality reduction method; low dimensional space; neural network; nonlinear feature dimensionality reduction method; speech feature; temporal features variability; Hidden Markov models; Linear discriminant analysis; Multi-layer neural network; Neural networks; Principal component analysis; Spatial databases; Speech recognition; State estimation; HMMs; dimensionality reduction; neural networks; nonlinear discriminant analysis
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495130