DocumentCode
3523191
Title
Explicit modeling of vowel coarticulation in continuous speech recognition
Author
Hieronymus, James L.
Author_Institution
US Nat. Inst. for Stand. & Technol., Gaithersburg, MD, USA
fYear
1989
fDate
23-26 May 1989
Firstpage
608
Abstract
An ongoing study of all sixteen American English vowels is reported, using subsets of the DARPA acoustic-phonetic database. Formants are obtained and normalized for each talker's formant range based on one sentence. The resulting formant tracks are smoothed using splines and sampled at nine equally spaced points in time within vowel-centered triphone regions. Triphones with semivowels in them are clustered separately. These formant values are k-means clustered using subsets of the sampled formant values. Additional supervised training is done using other parameters, including duration. The resulting clusters are used as a classifier on the basis of the modified Euclidean distance from the cluster centers. This results in approximately 80% first-choice vowel recognition of the outer edges of the vowel quadrilateral. Stressed vowels were found to have spectra which statistically were no more stable than those of unstressed vowels.
Keywords
speech analysis and processing; speech recognition; American English vowels; DARPA acoustic-phonetic database; continuous speech recognition; duration; formant tracks; k-means clustering; modelling; modified Euclidean distance; sampled formant values; stressed vowels; supervised training; unstressed vowels; vowel coarticulation; vowel recognition; Acoustic distortion; Context modeling; Euclidean distance; Frequency; Isolation technology; Speech recognition; Steady-state
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on
Conference_Location
Glasgow
ISSN
1520-6149
Type
conf
DOI
10.1109/ICASSP.1989.266500
Filename
266500