• DocumentCode
    1833192
  • Title

    Using likelihood L-statistics to measure confidence in audio-visual speech recognition

  • Author

    Ghosh, Arpita ; Verma, Ashish ; Sarkar, Abhinanda

  • Author_Institution
    Indian Inst. of Technol., Mumbai, India
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    27
  • Lastpage
    32
  • Abstract
    This paper describes previous work on decision fusion in audio-visual speech recognition and proposes a novel approach for combining audio and video channel information. We consider the frame-level phonetic classification problem using two single-stream Gaussian mixture models. The audio and video streams are adaptively weighted using the cumulative mean of the sample confidence values over past frames in addition to the present sample confidence value. The confidence values for the audio and video decisions are computed using an L-statistic (a linear combination of order statistics) of the log-likelihoods against the phone models. Experiments on a database of about 15000 sentences of large-vocabulary continuous speech show that the proposed approach yields better classification accuracy than other approaches.
  • Keywords
    Gaussian processes; audio signal processing; audio-visual systems; signal classification; signal sampling; speech recognition; statistical analysis; Gaussian mixture models; audio channel information; audio streams; audio-visual speech recognition; classification accuracy; confidence measurement; cumulative mean; frame-level phonetic classification; large vocabulary continuous speech; likelihood L-statistics; linear order-statistics; log-likelihoods; phone models; sample confidence values; sentences database; video channel information; video streams; Acoustic noise; Databases; Ear; Entropy; Noise measurement; Noise robustness; Signal to noise ratio; Speech recognition; Streaming media; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Signal Processing, 2001 IEEE Fourth Workshop on
  • Conference_Location
    Cannes
  • Print_ISBN
    0-7803-7025-2
  • Type

    conf

  • DOI
    10.1109/MMSP.2001.962707
  • Filename
    962707
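
The fusion scheme summarized in the Abstract can be illustrated with a minimal Python sketch. The record does not give the paper's actual formulas, so several choices here are assumptions made only for illustration: the L-statistic coefficients `(1, -1)` (the gap between the best and second-best log-likelihood), the blending factor `alpha` between the present confidence and the cumulative mean of past frames, the normalization of the two confidences into stream weights, and the weighted sum of per-phone log-likelihoods. The names `l_statistic`, `StreamConfidence`, and `fuse_frame` are hypothetical.

```python
from typing import Sequence


def l_statistic(log_likelihoods: Sequence[float], coeffs: Sequence[float]) -> float:
    """Linear combination of order statistics, largest log-likelihood first."""
    if len(coeffs) > len(log_likelihoods):
        raise ValueError("more coefficients than log-likelihoods")
    ordered = sorted(log_likelihoods, reverse=True)
    return sum(c * x for c, x in zip(coeffs, ordered))


class StreamConfidence:
    """Per-stream confidence: present value blended with the mean of past frames.

    Assumption: coeffs=(1, -1) and alpha=0.5 are illustrative, not taken
    from the paper.
    """

    def __init__(self, coeffs: Sequence[float] = (1.0, -1.0), alpha: float = 0.5):
        self.coeffs = coeffs
        self.alpha = alpha
        self._past_sum = 0.0
        self._past_count = 0

    def update(self, log_likelihoods: Sequence[float]) -> float:
        present = l_statistic(log_likelihoods, self.coeffs)
        if self._past_count == 0:
            blended = present  # no history yet, use the present value alone
        else:
            past_mean = self._past_sum / self._past_count
            blended = self.alpha * present + (1.0 - self.alpha) * past_mean
        self._past_sum += present
        self._past_count += 1
        return blended


def fuse_frame(
    audio_ll: Sequence[float],
    video_ll: Sequence[float],
    audio_conf: StreamConfidence,
    video_conf: StreamConfidence,
) -> int:
    """Return the index of the phone chosen for one frame.

    Assumption: weights are the normalized confidences, and the fused score
    is a weighted sum of the per-phone log-likelihoods of the two streams.
    """
    ca = audio_conf.update(audio_ll)
    cv = video_conf.update(video_ll)
    total = ca + cv
    w_audio = ca / total if total > 0 else 0.5  # fall back to equal weights
    fused = [w_audio * a + (1.0 - w_audio) * v for a, v in zip(audio_ll, video_ll)]
    return max(range(len(fused)), key=fused.__getitem__)
```

With these assumed coefficients, a stream whose top two phone scores are far apart receives a larger weight than one whose scores are nearly tied, so a confident audio decision can override an ambiguous video decision and vice versa.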