DocumentCode
701486
Title
Asynchronous integration of audio and visual sources in bi-modal automatic speech recognition
Author
Deleglise, Paul ; Rogozan, Alexandrina ; Alissali, Mamoun
Author_Institution
LIUM, University of Maine, Av. Olivier Messiaen, BP 535, 72017 Le Mans Cedex, France
fYear
1996
fDate
10-13 Sept. 1996
Firstpage
1
Lastpage
4
Abstract
This paper presents our work on the integration of visual data in automatic speech recognition systems. We particularly aim at solving two problems: (1) classification differences in the modeling of acoustic information (phonemes) and visual information (visemes); (2) the phenomena of anticipation and retention of visemes relative to the corresponding phonemes. We developed and tested three systems, each dealing with one or both problems and proposing a different integration strategy. The comparison of system performances shows that some of the solutions we propose give satisfactory results, and suggests that further work on some others would lead to further performance improvement.
Keywords
Acoustics; Hidden Markov models; Noise; Shape; Speech; Speech recognition; Visualization;
fLanguage
English
Publisher
IEEE
Conference_Titel
European Signal Processing Conference, 1996. EUSIPCO 1996. 8th
Conference_Location
Trieste, Italy
Print_ISBN
978-888-6179-83-6
Type
conf
Filename
7083212