DocumentCode
1927554
Title
Database Construction for Speech to Lip-readable Animation Conversion
Author
Takacs, Gyorgy ; Tihanyi, Atilla ; Bardi, Tamas ; Feldhoffer, Gergo ; Srancsi, Balint
Author_Institution
Fac. of Inf. Technol., Peter Pazmany Catholic Univ., Budapest
fYear
2006
fDate
38869
Firstpage
151
Lastpage
154
Abstract
The training database was one of the critical elements of our speech-to-facial-animation conversion system, which was developed as a communication aid for deaf people. The database was constructed from audio and visual recordings of professional lip-speakers. The standardized MPEG-4 system was used to animate the talking head model. The trained neural net is able to calculate, with acceptable error, the principal component (PC) weights of the feature points from the speech frames. The feature point coordinates are then calculated from the PC weights. The whole system can be implemented on mobile phones. In the final test, deaf persons were able to recognize about 50% of words from the speech-driven animation.
Keywords
audio databases; audio-visual systems; computer animation; data compression; learning (artificial intelligence); principal component analysis; speaker recognition; MPEG-4 system; audio-visual record; database construction; mobile phone; principal component weight; speech recognition; speech-lip readable animation conversion system; trained neural net; Audio databases; Deafness; Facial animation; MPEG 4 Standard; Magnetic heads; Mobile handsets; Neural networks; Spatial databases; Speech; Visual databases; Audiovisual speech processing; facial animation; lip reading; multimodal communication;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia Signal Processing and Communications, 48th International Symposium ELMAR-2006 focused on
Conference_Location
Zadar
ISSN
1334-2630
Print_ISBN
953-7044-03-3
Type
conf
DOI
10.1109/ELMAR.2006.329537
Filename
4127510