Regional Information Center for Science and Technology - Some acoustic improvements for pronunciation quality assessment for strongly accented Mandarin speech

DocumentCode :

2425968

Title :

Some acoustic improvements for pronunciation quality assessment for strongly accented Mandarin speech

Author :

Ge, Fengpei ; Pan, Fuping ; Liu, Changliang ; Dong, Bin ; Zhao, Qingwei ; Yan, Yonghong

Author_Institution :

ThinkIT Lab., Chinese Acad. of Sci., Beijing

fYear :

2008

fDate :

7-9 July 2008

Firstpage :

691

Lastpage :

696

Abstract :

This paper presents our recent work on resolving specific acoustic problems in a computer-assisted language learning (CALL) system by modifying the acoustic model (AM) and the features within an ASR framework. First, speaker-dependent cepstral mean normalization (Speaker CMN) is adopted to alleviate channel distortion, which improves the average human-machine scoring correlation coefficient (ACC) from 78.00% to 84.14%. Heteroscedastic linear discriminant analysis (HLDA) is then applied to enhance the discriminative ability of the AM, which increases ACC from 84.14% to 84.62%. In addition, HLDA reduces the large human-machine scoring difference for speech with very good or very poor pronunciation quality, raising the correctly-rank rate (CRR) from 85.59% to 90.99%. Finally, we use maximum a posteriori (MAP) adaptation to tune the AM to match the strongly accented test speech. As a result, ACC improves from 84.62% to 86.57%.
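
To make two of the quantities named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that Speaker CMN means subtracting, from every frame, the cepstral mean pooled over all utterances of the same speaker, and that the human-machine agreement can be illustrated with a plain Pearson correlation. The abstract does not say how the "average" in ACC is formed, so no averaging scheme is shown. The function names `speaker_cmn` and `pearson` and the data layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def speaker_cmn(utterances):
    """Subtract each speaker's pooled cepstral mean from all of their frames.

    utterances: list of (speaker_id, frames), where frames is a list of
    equal-length cepstral vectors (lists of floats). Assumed layout.
    """
    sums = {}
    counts = defaultdict(int)
    # Accumulate per-speaker sums over every frame of every utterance.
    for speaker, frames in utterances:
        for frame in frames:
            if speaker not in sums:
                sums[speaker] = [0.0] * len(frame)
            for i, value in enumerate(frame):
                sums[speaker][i] += value
            counts[speaker] += 1
    means = {
        speaker: [total / counts[speaker] for total in vector]
        for speaker, vector in sums.items()
    }
    # Return new frames with the speaker-level mean removed.
    return [
        (speaker, [[v - m for v, m in zip(frame, means[speaker])] for frame in frames])
        for speaker, frames in utterances
    ]


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

In this sketch the mean is pooled per speaker rather than per utterance, so a short utterance is normalized with statistics from all of that speaker's recordings; whether the paper pools in exactly this way is an assumption.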

Keywords :

human computer interaction; maximum likelihood estimation; speech processing; Mandarin speech; acoustic improvements; acoustic model; computer assisted language learning system; correctly-rank rate; heteroscedastic linear discriminate analysis; human-machine scoring correlation coefficient; human-machine scoring difference; maximum a posteriori; pronunciation quality; pronunciation quality assessment; speaker dependent cepstrum mean normalization; Acoustic distortion; Automatic speech recognition; Cepstral analysis; Decoding; Hidden Markov models; Loudspeakers; Man machine systems; Natural languages; Quality assessment; Viterbi algorithm;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Audio, Language and Image Processing, 2008. ICALIP 2008. International Conference on

Conference_Location :

Shanghai

Print_ISBN :

978-1-4244-1723-0

Electronic_ISBN :

978-1-4244-1724-7

Type :

conf

DOI :

10.1109/ICALIP.2008.4590175

Filename :

4590175

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2425968