• DocumentCode
    2789225
  • Title
    Statistical phone duration modeling to filter for intact utterances in a computer-assisted pronunciation training system
  • Author
    Lo, Wai-Kit ; Harrison, Alissa M. ; Meng, Helen
  • Author_Institution
    Chinese Univ. of Hong Kong, Hong Kong, China
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    5238
  • Lastpage
    5241
  • Abstract
    We study the use of a statistical phone duration model for separating intact utterances from corrupted ones in a computer-assisted pronunciation training system. Our system performs forced alignment between the input utterance and the canonical transcription of the prompted text. Intact utterances contain spoken content that corresponds to the text prompt. For these utterances, our system performs detailed phonetic analysis of the alignment and generates corrective feedback to highlight the occurrence of phonetic errors. Corrupted utterances result from disfluencies, truncated recordings, or spoken content that does not correspond to the text prompt. For these cases, the appropriate feedback is to invite the user to record again. We develop a filtering mechanism for intact input utterances by means of phone duration modeling. The likelihood-ratio test involving the phone-specific duration probability and an anti-model probability gave the best equal error rate (EER) of 17.16%, which is a 20% relative improvement over the baseline approach that incorporates phone-posterior probabilities.
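    The likelihood-ratio test named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the duration probabilities are parameterised or how per-phone scores are combined, so the Gaussian over log-duration, the averaging across phones, the threshold, and every name and number below (PHONE_MODELS, ANTI_MODEL, duration_llr, is_intact) are illustrative assumptions rather than the authors' method.

```python
import math

# Hypothetical parameters: (mean, std) of log-duration in log-seconds.
# These numbers are made up for illustration, not taken from the paper.
PHONE_MODELS = {
    "ae": (math.log(0.12), 0.4),
    "t": (math.log(0.06), 0.5),
}
# Assumed anti-model: one broad distribution shared by all phones.
ANTI_MODEL = (math.log(0.10), 1.2)


def log_gaussian(x, mean, std):
    """Log-density of a univariate Gaussian at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def duration_llr(segments, phone_models, anti_model):
    """Average log-likelihood ratio over force-aligned segments.

    segments: list of (phone, duration_in_seconds) pairs.
    """
    total = 0.0
    for phone, duration in segments:
        log_dur = math.log(duration)
        mean, std = phone_models[phone]
        # Phone-specific duration score minus anti-model score.
        total += log_gaussian(log_dur, mean, std) - log_gaussian(log_dur, *anti_model)
    return total / len(segments)


def is_intact(segments, phone_models, anti_model, threshold=0.0):
    """Accept the utterance when the average ratio reaches the threshold."""
    return duration_llr(segments, phone_models, anti_model) >= threshold
```

    Under these assumed parameters, an alignment with typical durations such as [("ae", 0.12), ("t", 0.06)] scores above the threshold, while one with implausibly short segments such as [("ae", 0.01), ("t", 0.005)] scores below it. In practice the threshold would be tuned on held-out data, for example at the equal-error-rate operating point.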
  • Keywords
    computer based training; probability; speech processing; canonical transcription; computer-assisted pronunciation training system; filtering mechanism; intact utterances; phone-posterior probabilities; phonetic analysis; statistical phone duration modeling; Automatic speech recognition; Computer interfaces; Dictionaries; Error correction; Feedback; Filtering; Filters; Hidden Markov models; Performance analysis; User interfaces; computer-aided pronunciation training; phone duration modeling; user interface;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5494988
  • Filename
    5494988