  • DocumentCode
    3320824
  • Title
    An Effective CALL System for Strongly Accented Mandarin Speech
  • Author
    Jiang, Tonghai ; Tang, Ming ; Ge, Fengpei ; Liu, Changliang ; Dong, Bin
  • Author_Institution
    Xinjiang Tech. Inst. of Phys. & Chem., Chinese Acad. of Sci., Wulumuqi, China
  • fYear
    2009
  • fDate
    28-29 Dec. 2009
  • Firstpage
    92
  • Lastpage
    95
  • Abstract
    In this paper, we address several acoustic problems specific to computer-assisted language learning (CALL) systems by modifying the acoustic model and features within a speech recognition framework. First, to alleviate channel and speaker distortion, we adopt speaker-dependent Cepstrum Mean Normalization (Speaker CMN), which improves the average correlation coefficient (ACC) between human and machine scores from 78.00% to 84.14%. Next, we apply Heteroscedastic Linear Discriminant Analysis (HLDA) to enhance the discriminative ability of the acoustic model, which raises ACC from 84.14% to 84.62%. HLDA also reduces the large human-machine scoring differences for speech of very good or very poor quality, increasing the correct-rank rate from 85.59% to 90.99%. Finally, we use Maximum a Posteriori (MAP) adaptation to tune the acoustic model to match the strongly accented test speech. As a result, ACC improves from 84.62% to 86.57%.
  • Keywords
    acoustic signal processing; cepstral analysis; computer aided instruction; linguistics; maximum likelihood estimation; natural language processing; speech recognition; CALL system; acoustic problem; channel distortion; computer assisted language learning; correlation coefficient; heteroscedastic linear discriminate analysis; human-machine scoring difference; maximum a posteriori; pronunciation quality assessment; speaker-dependent cepstrum mean normalization; speech recognition; strongly accented Mandarin speech; Acoustic distortion; Decoding; Hidden Markov models; Humans; Loudspeakers; Natural languages; Probability; Quality assessment; Speech recognition; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Research Challenges in Computer Science, 2009. ICRCCS '09. International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-0-7695-3927-0
  • Electronic_ISBN
    978-1-4244-5410-5
  • Type
    conf
  • DOI
    10.1109/ICRCCS.2009.31
  • Filename
    5401299