Allophonic variations in visual speech synthesis for corrective feedback in CAPT

Author

Wong, Ka-Ho ; Lo, Wai-Kit ; Meng, Helen

Author_Institution

Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Hong Kong, China

fYear

2011

fDate

22-27 May 2011

Firstpage

5708

Lastpage

5711

Abstract

This paper presents a visual speech synthesizer that provides midsagittal and front views of the vocal tract to help language learners correct their mispronunciations. We adopt a set of allophonic rules to determine the visualization of allophonic variations. We also implement coarticulation by decomposing a viseme (visualization of all articulators) into viseme components (visualization of tongue, lips, jaw, and velum separately). Viseme components are morphed independently, while the temporally adjacent articulations are taken into account. Subjective evaluation involving 6 subjects with a linguistic background shows that 54% of their responses prefer having allophonic variations incorporated.
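
The idea of morphing viseme components independently can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the sketch assumes that each component (tongue, lips, jaw, velum) has its own keyframe track and that shapes are blended by plain linear interpolation between temporally adjacent keyframes. The function names `morph_component` and `morph_viseme`, the scalar "shape" values, and the keyframe times are all hypothetical.

```python
from bisect import bisect_right

# Component names come from the abstract; everything else here is an
# illustrative assumption, not the method described in the paper.
COMPONENTS = ("tongue", "lips", "jaw", "velum")


def morph_component(keyframes, t):
    """Linearly interpolate one component's shape value at time t.

    keyframes: list of (time, value) pairs sorted by strictly increasing time.
    Times outside the track are clamped to the first or last keyframe.
    """
    times = [time for time, _ in keyframes]
    if t <= times[0]:
        return keyframes[0][1]
    if t >= times[-1]:
        return keyframes[-1][1]
    i = bisect_right(times, t)           # first keyframe strictly after t
    (t0, v0), (t1, v1) = keyframes[i - 1], keyframes[i]
    w = (t - t0) / (t1 - t0)             # blend weight in [0, 1)
    return (1 - w) * v0 + w * v1


def morph_viseme(tracks, t):
    """Evaluate every component track at time t, each on its own timeline."""
    return {name: morph_component(tracks[name], t) for name in COMPONENTS}


# Hypothetical tracks: the lips reach their target at t=0.5, earlier than
# the tongue, which is only possible because components move independently.
tracks = {
    "tongue": [(0.0, 0.0), (1.0, 1.0)],
    "lips": [(0.0, 0.0), (0.5, 1.0), (1.0, 1.0)],
    "jaw": [(0.0, 2.0), (1.0, 4.0)],
    "velum": [(0.0, 1.0), (1.0, 1.0)],
}
frame = morph_viseme(tracks, 0.5)
```

In this sketch the only sense in which adjacent articulations are considered is that each value is blended from the two neighboring keyframes of its own track; a real synthesizer would morph images or contours rather than scalars.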

Keywords

feedback; speech synthesis; CAPT; allophonic variations; computer-assisted pronunciation training; corrective feedback; viseme components; visual speech synthesis; Lips; Speech; Synthesizers; Timing; Tongue; Videos; Visualization; Audiovisual; allophone; coarticulation; language learning; synthesizer

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location

Prague

ISSN

1520-6149

Print_ISBN

978-1-4577-0538-0

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2011.5947656

Filename

5947656