• DocumentCode
    1702883
  • Title

    Enhancement of VSR using low dimension visual feature

  • Author

    Upadhyaya, Parag ; Farooq, Omar ; Varshney, Praveen ; Upadhyaya, Ajay

  • Author_Institution
    Dept. of Electron. & Commun. Eng., SRI-Datia, Datia, India
  • fYear
    2013
  • Firstpage
    71
  • Lastpage
    74
  • Abstract
    This paper studies low dimension visual (LDV) space features and investigates the improvement in audio-visual automatic speech recognition obtained with different sets of visual features. The experiment is divided into three phases: in the first phase, recognition is performed on 12 static DCT features; in the second phase, on a combination of 6 static and 6 dynamic features; and in the third phase, on 12 low dimension DCT features. For this work, the Hindi AMUAV (Aligarh Muslim University Audio-Visual) database was developed, with audio sampled at 44.1 kHz and video recorded at 25 frames per second. The Hidden Markov Model (HMM) toolkit with left-right HMMs was used for recognition, and an overall improvement of 26.04% in word recognition is achieved with LDV space features.
  • Keywords
    audio-visual systems; discrete cosine transforms; feature extraction; hidden Markov models; speech recognition; Aligarh Muslim University; HMM; Hindi AMUAV database; LDV space features; VSR enhancement; audio sample; audio visual automatic speech recognition; audio-visual database; frequency 44.1 kHz; hidden Markov model; low dimension DCT feature; low dimension visual space features; static DCT features; visual speech recognition; word recognition; Databases; Discrete cosine transforms; Feature extraction; Hidden Markov models; Speech; Speech recognition; Visualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia, Signal Processing and Communication Technologies (IMPACT), 2013 International Conference on
  • Conference_Location
    Aligarh
  • Print_ISBN
    978-1-4799-1202-5
  • Type

    conf

  • DOI
    10.1109/MSPCT.2013.6782090
  • Filename
    6782090