DocumentCode :
3427324
Title :
Mandarin vowel pronunciation quality evaluation by a novel formant classification method and its combination with traditional algorithms
Author :
Pan, Fuping ; Zhao, Qingwei ; Yan, Yonghong
Author_Institution :
ThinkIT Lab., Chinese Acad. of Sci., Beijing
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
5061
Lastpage :
5064
Abstract :
This paper addresses vowel pronunciation quality assessment in our computer-assisted Mandarin Chinese learning system. Within the speech recognition framework, phone-level pronunciation assessment is usually based on a phonetic posterior probability score, which can be computed either by normalizing frame-based posterior probabilities or directly on the phone segment. For vowels, the first method achieves a human-machine scoring correlation coefficient (CC) of 0.832, and the second reaches a CC of up to 0.847. To improve performance, we propose exploiting vowel formant features with a novel method: the formant candidates of each frame are plotted on the time-frequency plane to form a bitmap, from which Gabor features are extracted for pattern classification. Using the classification probability score for pronunciation assessment yields a CC of 0.842. Finally, we combine the three scores with various linear and nonlinear methods; the best CC, 0.913, is obtained with a neural network.
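The bitmap-plus-Gabor idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bin count, frequency ceiling, kernel size, wavelengths, orientations, and the mean-absolute-response pooling are all assumptions chosen for illustration, and the function names (`formant_bitmap`, `gabor_kernel`, `gabor_features`) are hypothetical.

```python
import math


def formant_bitmap(candidates, n_freq_bins=32, max_hz=4000.0):
    """Rasterize per-frame formant candidates (Hz) onto a time-frequency grid.

    candidates: one list of candidate frequencies per frame.
    Returns bitmap[freq_bin][frame], set to 1 where a candidate falls.
    The grid resolution and max_hz are assumed values, not from the paper.
    """
    n_frames = len(candidates)
    bitmap = [[0] * n_frames for _ in range(n_freq_bins)]
    for t, freqs in enumerate(candidates):
        for f in freqs:
            if 0.0 <= f < max_hz:
                bitmap[int(f / max_hz * n_freq_bins)][t] = 1
    return bitmap


def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a 2-D Gabor kernel (size should be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the cosine carrier runs along theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def gabor_energy(bitmap, kernel):
    """Mean absolute filter response over all fully overlapping positions."""
    k = len(kernel)
    rows, cols = len(bitmap), len(bitmap[0])
    total, count = 0.0, 0
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            resp = sum(kernel[i][j] * bitmap[r + i][c + j]
                       for i in range(k) for j in range(k))
            total += abs(resp)
            count += 1
    return total / count if count else 0.0


def gabor_features(bitmap, orientations=4, wavelengths=(4.0, 8.0),
                   size=7, sigma=2.0):
    """One pooled response per (wavelength, orientation) pair."""
    feats = []
    for w in wavelengths:
        for i in range(orientations):
            theta = i * math.pi / orientations
            feats.append(gabor_energy(bitmap, gabor_kernel(size, w, theta, sigma)))
    return feats


# Toy input: ten frames, each with two steady formant candidates.
frames = [[500.0, 1500.0] for _ in range(10)]
features = gabor_features(formant_bitmap(frames))
```

The resulting feature vector would then be passed to a classifier whose probability output serves as the pronunciation score; that classifier is outside the scope of this sketch.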
Keywords :
computer aided instruction; feature extraction; pattern classification; speech processing; speech recognition; Gabor feature extraction; Mandarin vowel pronunciation quality evaluation; computer assisted Mandarin Chinese learning system; formant classification method; human-machine scoring correlation coefficient; neural network; pattern classification probability score; phonetic frame-based posterior probability score; speech recognition framework; time-frequency plane; Decoding; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Neural networks; Probability; Quality assessment; Speech recognition; Time frequency analysis; Viterbi algorithm; Computer Assisted Language Learning; Formant; Gabor Feature; Neural Network; Speech Recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518796
Filename :
4518796