Development of a visual speech synthesizer via second-order isomorphism

Author

Jiang, Jintao ; Aronoff, Justin M. ; Bernstein, Lynne E.

Author_Institution

Dept. of Commun. Neurosci., House Ear Inst., Los Angeles, CA

fYear

2008

fDate

March 31 2008-April 4 2008

Firstpage

4677

Lastpage

4680

Abstract

The goal of this study was to evaluate, using second-order isomorphism, the synthesis of visible speech based on 3-D motion data. To do this, word stimuli were generated for perceptual discrimination and identification tasks. Discrimination trials were based on word pairs that were predicted to fall at four levels of perceptual dissimilarity. Results from the discrimination tasks indicated that visual synthetic speech perception maintained the dissimilarity structure of visual natural speech perception. This study demonstrated that relatively sparse 3-D representations of face motion can be used to synthesize visual speech that perceptually approximates visual natural speech, suggesting that synthesizer development and psychophysics can benefit mutually when their goals are aligned.
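
As a rough illustration of the idea of second-order isomorphism mentioned in the abstract, the sketch below checks whether two sets of items share the same dissimilarity structure, rather than whether the items themselves match. It is not the authors' procedure: the toy coordinates, the use of Euclidean distance, and the use of a Pearson correlation over pairwise distances are all assumptions made for this example.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of items, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def second_order_agreement(space_a, space_b):
    """Correlate the pairwise dissimilarities of the same items in two spaces."""
    return pearson(pairwise_distances(space_a), pairwise_distances(space_b))


# Hypothetical toy data: four "words" placed in a 2-D space for natural speech.
natural = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
# Hypothetical synthetic counterparts: shifted and rescaled, so no item matches
# its natural version point for point, yet relative dissimilarities are kept.
synthetic = [(2 * x + 1, 2 * y - 1) for x, y in natural]

agreement = second_order_agreement(natural, synthetic)
print(f"dissimilarity-structure agreement: {agreement:.3f}")
```

In this toy case the two spaces differ item by item, but the agreement score is high because the pattern of distances between items is preserved, which is the property the abstract reports for synthetic versus natural visual speech.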

Keywords

speech processing; speech synthesis; 3D motion data; face motion; perceptual discrimination; perceptual dissimilarity; second-order isomorphism; visual natural speech perception; visual speech synthesizer; visual synthetic speech perception; word-pairs; Acoustic devices; Deafness; Humans; Natural languages; Optical feedback; Optical sensors; Signal synthesis; Speech analysis; Speech synthesis; Synthesizers; Visual speech synthesis; dissimilarity; second-order isomorphism; visual speech perception;

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on

Conference_Location

Las Vegas, NV

ISSN

1520-6149

Print_ISBN

978-1-4244-1483-3

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2008.4518700

Filename

4518700