Statistical phone duration modeling to filter for intact utterances in a computer-assisted pronunciation training system

Author

Lo, Wai-Kit ; Harrison, Alissa M. ; Meng, Helen

Author_Institution

Chinese Univ. of Hong Kong, Hong Kong, China

fYear

2010

fDate

14-19 March 2010

Firstpage

5238

Lastpage

5241

Abstract

We study the use of a statistical phone duration model for separating intact utterances from corrupted ones in a computer-assisted pronunciation training system. Our system performs forced alignment between the input utterance and the canonical transcription of the prompted text. Intact utterances contain spoken content that corresponds to the text prompt. For these utterances, our system performs detailed phonetic analysis of the alignment and generates corrective feedback that highlights phonetic errors. Corrupted utterances result from disfluencies, truncated recordings, or spoken content that does not correspond to the text prompt. For these cases, the appropriate feedback is to invite the user to record again. We develop a filtering mechanism for intact input utterances by means of phone duration modeling. A likelihood-ratio test involving the phone-specific duration probability and an anti-model probability gave the best equal error rate (EER) of 17.16%, which is a 20% relative improvement over the baseline approach that incorporates phone-posterior probabilities.
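
The duration-based likelihood-ratio test described in the abstract can be illustrated with a minimal sketch. The abstract does not state which distributions the phone-specific model and the anti-model use, so this example assumes a Gaussian over log-duration for both. The function names, the parameter layout, the per-phone averaging, and the threshold are all hypothetical and are not taken from the paper.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian (assumed model form)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def utterance_llr(aligned, phone_models, anti_model):
    """Average per-phone log-likelihood ratio for one forced alignment.

    aligned      -- list of (phone_label, duration_in_seconds) pairs
    phone_models -- dict: phone_label -> (mean, std) of log-duration (hypothetical)
    anti_model   -- (mean, std) of log-duration for the anti-model (hypothetical)
    """
    if not aligned:
        raise ValueError("alignment must contain at least one phone")
    total = 0.0
    for phone, duration in aligned:
        log_d = math.log(duration)
        mean, std = phone_models[phone]
        # Phone-specific duration log-probability minus anti-model log-probability.
        total += gaussian_logpdf(log_d, mean, std) - gaussian_logpdf(log_d, *anti_model)
    return total / len(aligned)


def is_intact(aligned, phone_models, anti_model, threshold=0.0):
    """Accept the utterance as intact when the averaged ratio clears the threshold."""
    return utterance_llr(aligned, phone_models, anti_model) >= threshold
```

In a sketch like this, the threshold would be swept over held-out data, and the EER is the operating point where the false-acceptance and false-rejection rates are equal. Utterances that fall below the threshold would trigger the "record again" feedback rather than detailed phonetic analysis.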

Keywords

computer based training; probability; speech processing; canonical transcription; computer-assisted pronunciation training system; filtering mechanism; intact utterances; phone-posterior probabilities; phonetic analysis; statistical phone duration modeling; Automatic speech recognition; Computer interfaces; Dictionaries; Error correction; Feedback; Filtering; Filters; Hidden Markov models; Performance analysis; User interfaces; computer-aided pronunciation training; phone duration modeling; user interface;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location

Dallas, TX

ISSN

1520-6149

Print_ISBN

978-1-4244-4295-9

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2010.5494988

Filename

5494988