DocumentCode :
1927554
Title :
Database Construction for Speech to Lip-readable Animation Conversion
Author :
Acs, Gyorgy Ta ; Tihanyi, Atilla ; Bardi, Tamas ; Feldhoffer, Gergo ; Srancsi, Balint
Author_Institution :
Fac. of Inf. Technol., Peter Pazmany Catholic Univ., Budapest
fYear :
2006
fDate :
June 2006
Firstpage :
151
Lastpage :
154
Abstract :
The training database was one of the critical elements of our speech-to-facial-animation conversion system. This system was developed as a communication aid for deaf people. The database was constructed from audio and visual recordings of professional lip-speakers. The standardized MPEG-4 system was used to animate the talking head model. The trained neural net calculates the principal component (PC) weights of the feature points from the speech frames with acceptable error. The feature point coordinates are then calculated from the PC weights. The whole system can be implemented on mobile phones. In the final test, deaf persons were able to recognize about 50% of the words from the speech-driven animation.
Keywords :
audio databases; audio-visual systems; computer animation; data compression; learning (artificial intelligence); principal component analysis; speaker recognition; MPEG-4 system; audio-visual record; database construction; mobile phone; principal component weight; speech recognition; speech-lip readable animation conversion system; trained neural net; Audio databases; Deafness; Facial animation; MPEG 4 Standard; Magnetic heads; Mobile handsets; Neural networks; Spatial databases; Speech; Visual databases; Audiovisual speech processing; facial animation; lip reading; multimodal communication;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
48th International Symposium ELMAR-2006 focused on Multimedia Signal Processing and Communications
Conference_Location :
Zadar
ISSN :
1334-2630
Print_ISBN :
953-7044-03-3
Type :
conf
DOI :
10.1109/ELMAR.2006.329537
Filename :
4127510