• DocumentCode
    2245526
  • Title

    Asynchronous integration of visual information in an automatic speech recognition system

  • Author

    Alissali, Mamoun ; Deléglise, Paul ; Rogozan, Alexandrina

  • Author_Institution
    Lab. d'Inf., Maine Univ., Le Mans, France
  • Volume
    1
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    34
  • Abstract
    Deals with the integration of visual data in automatic speech recognition systems. We first describe the framework of our research: the development of advanced multi-user multi-modal interfaces. Then we present audio-visual speech recognition problems in general, and the ones we are interested in in particular. After a very brief discussion of existing systems, we present the architecture of our audio-only reference and baseline systems and describe our audio-visual systems. The major part of the paper describes the systems we developed according to two different approaches to the problem of integrating visual data in speech recognition systems. We first describe a system we developed according to the first approach (called the direct integration model) and show its limitations. Our approach, which we call asynchronous integration, is then presented. After the general guidelines, we go into some detail about the distributed architecture and the variant of the N-best algorithm we developed to implement this approach. The performance of these different systems is compared, and we conclude with a brief discussion of the performance improvements we have obtained and of future work.
  • Keywords
    audio-visual systems; parallel architectures; software performance evaluation; speech recognition; user interfaces; N-best algorithm; advanced multi-user multi-modal interfaces; asynchronous integration; audio-only systems architecture; audio-visual speech recognition problems; automatic speech recognition system; direct integration model; distributed architecture; performance improvements; visual information integration; Acoustic noise; Acoustic testing; Automatic speech recognition; Automatic testing; Guidelines; Noise level; Noise robustness; Probability distribution; Speech enhancement; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.607018
  • Filename
    607018