Author_Institution :
Inst. of Acoust., Acad. Sinica, Beijing, China
Abstract :
A novel method for automatically extracting stop-oriented features from Chinese speech using wavelet transforms is presented. The method classifies Chinese voiceless consonants into two stop subsets, BDG: {b,d,g} and zZHJGPTcCHQK: {z,zh,j,g,p,t,c,ch,q,k}, and one fricative subset, FsSHhX: {f,s,sh,x,h}. For each speech token of C-V syllables (V- denotes voiceless consonant), the algorithm calculates detection objectives and outputs the boundary marks of the voiceless consonant together with one symbol from {b.d.g, STOP/BD, f.s.sh.x.h} as its category marker. The method was tested on a subset of 913 C-V syllables extracted from a database of 1276 Chinese all-syllable tokens, with hand-labeled initial and final segment markers as the benchmark. The classification accuracy was 96.1%, 95.1%, and 89.0% for the categories b.d.g, STOP/BD, and f.s.sh.x.h, respectively, and the average accuracy over all 913 C-V syllables was 93.6%.
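The abstract does not say which wavelet or which detection objectives the method uses. As a rough illustration only, the sketch below assumes a Haar wavelet and uses per-level detail energy as a stand-in feature. Both choices are assumptions for illustration, not the authors' method. The function names and the two toy frames are hypothetical. It shows how a multi-level wavelet decomposition separates a rapidly varying (noise-like) frame from a slowly varying one.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def subband_energies(signal, levels):
    """Detail energy at each level (finest first), followed by the final approximation energy."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


# Hypothetical toy frames (not real speech data).
rapid_frame = [1.0, -1.0] * 4   # alternates every sample: energy sits in the finest detail band
smooth_frame = [1.0] * 8        # constant: energy sits in the final approximation band

rapid_energies = subband_energies(rapid_frame, levels=2)
smooth_energies = subband_energies(smooth_frame, levels=2)
```

Because the Haar transform used here is orthonormal, the sub-band energies of each frame sum to the energy of the frame itself, so the list can be read as a distribution of signal energy across scales. A real system would work on windowed speech samples and would need a decision rule on top of such features; neither is specified in the abstract.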
Keywords :
feature extraction; pattern classification; speech recognition; wavelet transforms; Chinese speech; all-syllable tokens; automatic extraction; classification accuracy; fricative subset; stop subsets; stop-oriented feature extraction; syllables extraction; voiceless consonants; wavelet transform; Acoustic waves; Capacitance-voltage characteristics; Databases; Explosions; Frequency; Speech processing; System testing