DocumentCode :
2988907
Title :
Identification of unaspirated plosives using integrated temporal and spectral features in dynamic representation as acoustic cues
Author :
Yeung, D.Y. ; Chan, C.
Author_Institution :
University of Hong Kong
Volume :
10
fYear :
1985
fDate :
Apr 1985
Firstpage :
1593
Lastpage :
1596
Abstract :
Using a set of integrated temporal and spectral features to parameterize the consonant regions of C-V syllables, natural Mandarin unaspirated plosives /b/, /d/, /g/, /j/, /zh/, and /z/ were identified satisfactorily when stepwise discriminant function analysis was employed for classification. Of the several parametric representation methods tried, a dynamic representation of the phonemes in terms of temporal and spectral features gave the highest overall percentage of correct recognition. When training and testing were performed across all vowel environments, so that any context-dependent variations were treated as mere within-group differences, the overall recognition rate for the 6 consonants was 90% for close testing and 86% for open testing. If phonetic context was identified first by assuming knowledge of the vowels in C-V syllables (context-dependent), an average improvement of 6% in close testing was obtained for each of the vowels. The importance of these experiments is two-fold. First, context-free distinctive cues for place of articulation, as derived by some previous researchers, were modified and generalized to apply satisfactorily to stops as well as to other unvoiced consonants in initial positions. These were obtained by a method that can practically be implemented in a real speech recognition system because of its quantitative representation of discriminating features. Secondly, context-dependent sources of information are by no means redundant. They can further improve the average rate of correct recognition by 6%, indicating that important context-dependent sources of information exist. To generalize the method to other manner classes, a preliminary investigation of the identification of both Mandarin aspirated plosives and fricatives using the same approach was also performed.
Keywords :
Capacitance-voltage characteristics; Information resources; Performance evaluation; Speech recognition; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '85.
Type :
conf
DOI :
10.1109/ICASSP.1985.1168079
Filename :
1168079