DocumentCode :
3192789
Title :
A comparison of feature representations for speaker-independent voiced-stop-consonant recognition
Author :
Bryant, Benjamin D. ; Gowdy, John N.
Author_Institution :
Dept. of Electr. & Comput. Eng., Clemson Univ., SC, USA
fYear :
1993
fDate :
4-7 Apr 1993
Firstpage :
0.75
Abstract :
The authors investigated various feature representations of speech which seem to provide robust estimates of machine-recognition-relevant parameters for the voiced-stop-consonant phoneme class. Instances of a block-windowed neural network (BWNN) were trained and tested using feature vectors extracted from data of up to four dialect regions in the TIMIT database. Three feature representations were chosen for use in this research based on their past performance in consulted feature representation studies. It is concluded that the feature representations produced by Seneff's (1988) auditory model, particularly the mean-rate response representation, are good representations for voiced-stop consonant speech as well as vowel speech. It is also concluded that the addition of dynamic feature information in the form of differenced cepstral coefficients to the conglomerate mel-cepstral representative vectors made a difference in the recognition rate for voiced-stop consonants over the use of the mel-frequency cepstral coefficients alone. It can be hypothesized that the use of the BWNN architectures produced better recognition results than the use of other architectures that do not take into account the time and frequency variabilities encountered in utterances from different speakers.
Keywords :
cepstral analysis; neural nets; signal representation; speech recognition; TIMIT database; block-windowed neural network; dialect regions; differenced cepstral coefficients; dynamic feature information; feature representations; feature vectors; machine-recognition-relevant parameters; mean-rate response representation; mel-cepstral representative vectors; mel-frequency cepstral coefficients; performance; recognition rate; robust estimates; speaker-independent voiced-stop-consonant recognition; voiced-stop consonant speech; vowel speech; Cepstral analysis; Data mining; Feature extraction; Frequency; Neural networks; Parameter estimation; Robustness; Spatial databases; Speech recognition; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Southeastcon '93, Proceedings., IEEE
Conference_Location :
Charlotte, NC
Print_ISBN :
0-7803-1257-0
Type :
conf
DOI :
10.1109/SECON.1993.465782
Filename :
465782