Title :
Explicit modeling of vowel coarticulation in continuous speech recognition
Author :
Hieronymus, James L.
Author_Institution :
US Nat. Inst. for Stand. & Technol., Gaithersburg, MD, USA
Abstract :
An ongoing study is reported of all sixteen of the American English vowels using subsets of the DARPA acoustic-phonetic database. Formants are obtained and normalized for each talker's formant range based on one sentence. The resulting formant tracks are smoothed using splines and sampled at nine equally spaced points in time within vowel-centered triphone regions. Triphones with semivowels in them are clustered separately. These formant values are k-means clustered using subsets of the sampled formant values. Additional supervised training is done using other parameters, including duration. The resulting clusters are used as a classifier on the basis of the modified Euclidean distance from the cluster centers. This results in approximately 80% first-choice recognition of vowels on the outer edges of the vowel quadrilateral. Stressed vowels were found to have spectra which statistically were no more stable than those of unstressed vowels.
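The processing chain named in the abstract (range normalization, sampling at nine equally spaced points, k-means clustering, nearest-center classification) can be illustrated with a minimal sketch. This is not the author's implementation. All function names are hypothetical, linear interpolation stands in for the spline smoothing, and because the abstract does not define the "modified" Euclidean distance, a per-dimension weighted distance is assumed here.

```python
import math


def normalize(values, lo, hi):
    """Map formant values onto [0, 1] using a talker's range (assumed min-max form)."""
    return [(v - lo) / (hi - lo) for v in values]


def sample_track(times, values, n_points=9):
    """Sample a track at n_points equally spaced times, endpoints included.

    Linear interpolation is used here in place of spline smoothing.
    """
    t0, t1 = times[0], times[-1]
    samples = []
    j = 0
    for i in range(n_points):
        t = t0 + (t1 - t0) * i / (n_points - 1)
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        samples.append(values[j] + frac * (values[j + 1] - values[j]))
    return samples


def kmeans(points, centers, n_iter=10):
    """Plain Lloyd iterations starting from caller-supplied initial centers."""
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            groups[idx].append(p)
        # Recompute each center as the mean of its group; keep empty clusters as-is.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def weighted_distance(x, center, weights):
    """Euclidean distance with per-dimension weights (an assumed 'modified' form)."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, center, weights)))


def classify(x, centers, weights):
    """Return the label whose cluster center is nearest to x.

    centers is a dict mapping a vowel label to its center vector.
    """
    return min(centers, key=lambda label: weighted_distance(x, centers[label], weights))
```

In such a sketch, each vowel token would contribute one vector built from its sampled, normalized formant values, and the weights would let some sample points or formants count more than others.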
Keywords :
speech analysis and processing; speech recognition; American English vowels; DARPA acoustic-phonetic database; continuous speech recognition; duration; formant tracks; k-means clustering; modelling; modified Euclidean distance; sampled formant values; stressed vowels; supervised training; unstressed vowels; vowel coarticulation; vowel recognition; Acoustic distortion; Context modeling; Euclidean distance; Frequency; Isolation technology; Speech recognition; Steady-state
Conference_Title :
Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on
Conference_Location :
Glasgow
DOI :
10.1109/ICASSP.1989.266500