• DocumentCode
    3249489
  • Title

    Dynamic visual features based on discriminative speech class projection for visual speech recognition

  • Author

    Lei, Xie ; Xiu-Li, Cai ; Zhong-Hua, Fu ; Rong-Chun, Zhao

  • Author_Institution
    Sch. of Comput. Sci., Northwestern Polytech. Univ., Xi'an, China
  • fYear
    2004
  • fDate
    20-22 Oct. 2004
  • Firstpage
    687
  • Lastpage
    690
  • Abstract
    This paper presents a dynamic visual feature extraction scheme that captures important lip motion information for visual speech recognition. Discriminative projections based on a priori chosen speech classes (phonemes and visemes) are applied to the concatenation of pre-extracted static visual features. First- and second-order temporal derivatives are subsequently extracted to further represent the dynamic differences. Experiments on a connected-digits task demonstrate that the proposed highly discriminative dynamic features, when appended to the static features, yield superior recognition performance. Compared with the commonly used delta and acceleration features, the proposed dynamic features lead to an 8% absolute improvement in word accuracy on the considered recognition task.
  • Keywords
    feature extraction; hidden Markov models; image sequences; speech recognition; MPEG-1 video; concatenated pre-extracted static visual features; discriminative dynamic features; discriminative speech class projection; dynamic visual feature extraction; linear discriminant analysis; lip motion information; mouth image sequences; phonemes; temporal derivatives; visemes; visual speech recognition; word accuracy; Acoustic noise; Auditory system; Automatic speech recognition; Data mining; Feature extraction; Hidden Markov models; Humans; Noise robustness; Speech processing; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Multimedia, Video and Speech Processing, 2004. Proceedings of 2004 International Symposium on
  • Print_ISBN
    0-7803-8687-6
  • Type

    conf

  • DOI
    10.1109/ISIMP.2004.1434157
  • Filename
    1434157
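
The pipeline outlined in the abstract (concatenate neighbouring static features, apply a discriminative linear projection, then take first- and second-order temporal derivatives) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the context width, the edge padding, the regression-style derivative, and the projection matrix (which would normally be learned, for example by linear discriminant analysis over phoneme or viseme classes) are all hypothetical choices that do not come from the record.

```python
# Illustrative sketch only. Context width, delta window, edge handling and the
# projection matrix are assumptions, not values taken from the paper.

def stack_frames(frames, context):
    """Concatenate each frame with its +/-context neighbours (edge frames repeated)."""
    last = len(frames) - 1
    stacked = []
    for t in range(len(frames)):
        row = []
        for k in range(t - context, t + context + 1):
            row.extend(frames[min(max(k, 0), last)])
        stacked.append(row)
    return stacked


def project(stacked, matrix):
    """Apply a linear projection; each row of `matrix` yields one output dimension."""
    return [[sum(w * x for w, x in zip(weights, vec)) for weights in matrix]
            for vec in stacked]


def deltas(feats, window=2):
    """Regression-based temporal derivative over +/-window frames (edges repeated)."""
    last = len(feats) - 1
    norm = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(len(feats)):
        vec = []
        for d in range(len(feats[0])):
            acc = sum(n * (feats[min(t + n, last)][d] - feats[max(t - n, 0)][d])
                      for n in range(1, window + 1))
            vec.append(acc / norm)
        out.append(vec)
    return out


def dynamic_features(static, matrix, context=1, window=2):
    """Projected features followed by their first and second derivatives, per frame."""
    projected = project(stack_frames(static, context), matrix)
    first = deltas(projected, window)
    second = deltas(first, window)
    return [p + d1 + d2 for p, d1, d2 in zip(projected, first, second)]
```

In this sketch the caller would append the returned vectors to the static features frame by frame before passing them to a recognizer; how the paper combines the two streams is not specified in the record.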