Regional Information Center for Science and Technology (RICEST) - Lip synchronization using linear predictive analysis

DocumentCode :

2454393

Title :

Lip synchronization using linear predictive analysis

Author :

Kshirsagar, Sumedha ; Magnenat-Thalmann, Nadia

Author_Institution :

MIRALAB, Geneva Univ., Switzerland

Volume :

fYear :

2000

fDate :

2000

Firstpage :

1077

Abstract :

Linear predictive analysis is a widely used technique for speech analysis and encoding. The authors discuss the issues involved in its application to phoneme extraction and lip synchronization. The LP analysis results in a set of reflection coefficients that are closely related to the vocal tract shape. Since the vocal tract shape can be correlated with the phoneme being spoken, LP analysis can be directly applied to phoneme extraction. We train neural networks to classify the reflection coefficients into a set of vowels. In addition, average energy is used to handle vowel-vowel and vowel-consonant transitions, whereas the zero-crossing information is used to detect the presence of fricatives. We directly apply the extracted phoneme information to our synthetic 3D face model. The proposed method is fast, easy to implement, and adequate for real-time speech animation. As the method does not rely on language structure or speech recognition, it is language independent. Moreover, the method is speaker independent. It can be applied to lip synchronization for entertainment applications and avatar animation in virtual environments.
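The three per-frame features named in the abstract (reflection coefficients, average energy, and zero-crossing information) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of the Levinson-Durbin recursion on the frame autocorrelation, and the choice of prediction order are assumptions made here for illustration, and the neural-network classification step is not shown.

```python
import math


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def reflection_coefficients(frame, order):
    """Reflection coefficients k_1..k_order via the Levinson-Durbin recursion.

    Sign convention (an assumption): predictor polynomial
    A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, so k_1 = -r[1] / r[0].
    """
    r = autocorrelation(frame, order)
    ks = [0.0] * order
    if r[0] <= 0.0:              # silent frame: nothing to predict
        return ks
    a = [1.0] + [0.0] * order
    err = r[0]                   # prediction error energy
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err
        ks[m - 1] = k
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k
        if err <= 1e-12 * r[0]:  # numerically exhausted: stop early
            break
    return ks


def average_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
    return crossings / (len(frame) - 1)


def frame_features(frame, order=10):
    """Bundle the three features for one frame (order=10 is an assumed default)."""
    return {
        "reflection": reflection_coefficients(frame, order),
        "energy": average_energy(frame),
        "zcr": zero_crossing_rate(frame),
    }
```

In such a sketch, the `reflection` vector would be the input to a vowel classifier, while `energy` and `zcr` would be compared against thresholds chosen for the recording setup to flag transitions and fricatives; the record does not state any threshold values.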

Keywords :

computer animation; neural nets; real-time systems; speech processing; synchronisation; LP analysis; avatar animation; average energy; entertainment applications; fricatives; language independent method; language structure; linear predictive analysis; lip synchronization; neural networks; phoneme; phoneme extraction; phoneme information; real time speech animation; reflection coefficients; speech analysis; speech encoding; speech recognition; synthetic 3D face model; virtual environments; vocal tract shape; vowel-consonant transitions; zero crossing information; Data mining; Encoding; Face detection; Facial animation; Natural languages; Neural networks; Reflection; Shape; Speech analysis; Speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Multimedia and Expo, 2000. ICME 2000. 2000 IEEE International Conference on

Conference_Location :

New York, NY

Print_ISBN :

0-7803-6536-4

Type :

conf

DOI :

10.1109/ICME.2000.871547

Filename :

871547

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2454393