DocumentCode :
312204
Title :
Maximum likelihood learning of auditory feature maps for stationary vowels
Author :
Wang, Kuansan ; Lee, Chin-Hui ; Juang, Biing-hwang
Author_Institution :
AT&T Bell Labs., Murray Hill, NJ, USA
Volume :
2
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
1265
Abstract :
A mathematical framework for learning acoustic features from a central auditory representation is presented. The authors adopt a statistical approach that models the learning process as maximum likelihood estimation of the signal distribution. An algorithm, called statistical matching pursuit (SMP), is introduced to identify regions on the cortical surface where the features for each sound class are most prominent. They model the features with Gaussian mixture densities and employ the expectation-maximization (EM) procedure both to improve the parameterization and to iteratively refine the selection of cortical regions from which the features are extracted. The learning algorithm is applied to vowel classification on the TIMIT database, where all the vowels (excluding diphthongs, nine in total) are regarded as individual classes. Experimental results show that models trained under the SMP/EM algorithm achieve a recognition accuracy comparable to that of conventional recognizers.
Keywords :
Gaussian distribution; feature extraction; maximum likelihood estimation; pattern matching; speech recognition; statistical analysis; Gaussian mixture densities; TIMIT database; acoustic feature learning; auditory feature maps; central auditory representation; cortical surface; expectation-maximization procedure; feature extraction; iterative refinement; mathematical framework; maximum likelihood estimation; maximum likelihood learning; parameterization; recognition accuracy; signal distribution; sound class; stationary vowels; statistical approach; statistical matching pursuit algorithm; vowel classification; Automatic speech recognition; Feature extraction; Iterative algorithms; Matching pursuit algorithms; Maximum likelihood estimation; Pattern recognition; Signal processing; Signal processing algorithms; Speech recognition; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607840
Filename :
607840