DocumentCode :
346162
Title :
A segmental time-alignment technique for text-speech synchronization
Author :
Vignoli, Fabio ; Lavagetto, Fabio
Author_Institution :
Genoa Univ., Italy
fYear :
1999
fDate :
1999
Firstpage :
408
Lastpage :
412
Abstract :
The bimodal acoustic-visual effect is of extreme importance in human face-to-face communication; it has been broadly investigated, and the improvement in understanding when visual cues are integrated with speech has been clearly demonstrated, particularly in noisy environments. In this paper, we propose a novel synchronization procedure for speech and text, consisting of a neural network-based acoustic segmentation method for phoneme classes and a phonetic-acoustic time alignment algorithm which we call Segmental Time-Alignment (STA). The proposed algorithm is fast and speaker-independent since it uses neural networks trained to discriminate among broad phoneme classes. This technique has been used to animate the MPEG-4 compliant DIST face model.
Keywords :
computer animation; multimedia systems; neural nets; speech processing; synchronisation; MPEG-4 compliant DIST face model animation; bimodal acoustic-visual effect; human face-to-face communication; neural network-based acoustic segmentation; noisy environments; phoneme class discrimination; phonetic-acoustic time alignment algorithm; segmental time-alignment technique; speaker-independent algorithm; text-speech synchronization; visual cues; Acoustic noise; Databases; Facial animation; Information geometry; Lips; MPEG 4 Standard; Magnetic heads; Neural networks; Speech analysis; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence and Multimedia Applications, 1999. ICCIMA '99. Proceedings. Third International Conference on
Conference_Location :
New Delhi
Print_ISBN :
0-7695-0300-4
Type :
conf
DOI :
10.1109/ICCIMA.1999.798565
Filename :
798565