Title :
A comparison of auditory models for speaker independent phoneme recognition
Author :
Anderson, Timothy R.
Author_Institution :
Armstrong Lab., Wright-Patterson AFB, OH, USA
Abstract :
Neural networks that employ unsupervised learning were applied to the output of two different models of the auditory periphery to perform phoneme recognition. Experiments that compared the performance of these two auditory model representations with that of mel-cepstral coefficients show that the auditory models perform significantly better (t-test, p < 0.05) in terms of phoneme recognition accuracy under the conditions tested (high signal-to-noise ratio and 10 sentences from each of 10 speakers). However, the three representations make different types of broad class recognition errors. The Payton auditory model representation performs best, with the highest overall phoneme and broad class performance. Using this representation, an acoustic segment can be assigned to one of 39 phoneme categories with at least 38% recognition accuracy. The resulting context-independent phoneme recognition performance was better than that of the SPHINX system with the same number of speakers in the training set and no smoothing of parameters.
Keywords :
neural nets; performance evaluation; physiological models; speech recognition; unsupervised learning; Payton auditory model representation; auditory periphery; mel-cepstral coefficients; performance; phoneme recognition accuracy; speaker independent phoneme recognition
Conference_Title :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.1993.319277