Acoustically-Driven Talking Face Synthesis using Dynamic Bayesian Networks

Author

Xue, Jianxia ; Borgstrom, Jonas ; Jiang, Jintao ; Bernstein, Lynne E. ; Alwan, Abeer

Author_Institution

California Univ., Los Angeles, CA

fYear

2006

fDate

9-12 July 2006

Firstpage

1165

Lastpage

1168

Abstract

Dynamic Bayesian networks (DBNs) have been widely studied in multi-modal speech recognition applications. Here, we introduce DBNs into an acoustically-driven talking face synthesis system. Three prototypes of DBNs, namely independent, coupled, and product HMMs, were studied. Results showed that the DBN methods were more effective in this study than a multilinear regression baseline. Coupled and product HMMs performed similarly to each other, and both outperformed independent HMMs in terms of motion trajectory accuracy. Audio and visual speech asynchronies were represented differently for coupled HMMs versus product HMMs.

Keywords

acoustics; audio-visual systems; belief networks; face recognition; hidden Markov models; speech processing; speech recognition; speech synthesis; visual perception; DBN; HMM; acoustically-driven talking face synthesis system; audio-visual speech; dynamic Bayesian network; hidden Markov model; multimodal speech recognition application; Bayesian methods; Context modeling; Feature extraction; Hidden Markov models; Network synthesis; Optical devices; Optical noise; Prototypes; Speech recognition; Speech synthesis

fLanguage

English

Publisher

IEEE

Conference_Titel

Multimedia and Expo, 2006 IEEE International Conference on

Conference_Location

Toronto, Ont.

Print_ISBN

1-4244-0366-7

Electronic_ISBN

1-4244-0367-7

Type

conf

DOI

10.1109/ICME.2006.262743

Filename

4036812