• DocumentCode
    2425968
  • Title
    Some acoustic improvements for pronunciation quality assessment for strongly accented Mandarin speech
  • Author
    Ge, Fengpei ; Pan, Fuping ; Liu, Changliang ; Dong, Bin ; Zhao, Qingwei ; Yan, Yonghong
  • Author_Institution
    ThinkIT Lab., Chinese Acad. of Sci., Beijing
  • fYear
    2008
  • fDate
    7-9 July 2008
  • Firstpage
    691
  • Lastpage
    696
  • Abstract
    This paper presents our recent work on resolving specific acoustic problems in a computer-assisted language learning (CALL) system by modifying the acoustic model (AM) and the features within an ASR framework. First, speaker-dependent cepstrum mean normalization (Speaker CMN) is adopted to alleviate channel distortion, which improves the average human-machine scoring correlation coefficient (ACC) from 78.00% to 84.14%. Heteroscedastic linear discriminant analysis (HLDA) is then applied to enhance the discriminative ability of the AM, which increases ACC from 84.14% to 84.62%. Additionally, HLDA reduces the large human-machine scoring difference for utterances with very good or very poor pronunciation quality, raising the correctly-rank rate (CRR) from 85.59% to 90.99%. Finally, we use maximum a posteriori (MAP) adaptation to tune the AM to match the strongly accented test speech. As a result, ACC improves from 84.62% to 86.57%.
  • Keywords
    human computer interaction; maximum likelihood estimation; speech processing; Mandarin speech; acoustic improvements; acoustic model; computer assisted language learning system; correctly-rank rate; heteroscedastic linear discriminate analysis; human-machine scoring correlation coefficient; human-machine scoring difference; maximum a posteriori; pronunciation quality; pronunciation quality assessment; speaker dependent cepstrum mean normalization; Acoustic distortion; Automatic speech recognition; Cepstral analysis; Decoding; Hidden Markov models; Loudspeakers; Man machine systems; Natural languages; Quality assessment; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Audio, Language and Image Processing, 2008. ICALIP 2008. International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-1723-0
  • Electronic_ISBN
    978-1-4244-1724-7
  • Type
    conf
  • DOI
    10.1109/ICALIP.2008.4590175
  • Filename
    4590175