• DocumentCode
    701486
  • Title

    Asynchronous integration of audio and visual sources in bi-modal automatic speech recognition

  • Author

    Deleglise, Paul ; Rogozan, Alexandrina ; Alissali, Mamoun

  • Author_Institution
    LIUM, University of Maine, Av. Olivier Messiaen, BP 535, 72017 Le Mans Cedex, France
  • fYear
    1996
  • fDate
    10-13 Sept. 1996
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This paper presents our work on the integration of visual data in automatic speech recognition systems. We particularly aim at solving two problems: (1) classification differences between the modeling of acoustic information (phonemes) and visual information (visemes); (2) the phenomena of anticipation and retention of visemes relative to the corresponding phonemes. We developed and tested three systems, each dealing with one or both problems and proposing a different integration strategy. The comparison of system performances shows that some of the solutions we propose give satisfactory results, and suggests that further work on some of the others would bring additional performance improvements.
  • Keywords
    Acoustics; Hidden Markov models; Noise; Shape; Speech; Speech recognition; Visualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    European Signal Processing Conference, 1996. EUSIPCO 1996. 8th
  • Conference_Location
    Trieste, Italy
  • Print_ISBN
    978-888-6179-83-6
  • Type

    conf

  • Filename
    7083212