• DocumentCode
    3735347
  • Title
    Lip-based visual speech recognition system

  • Author
    Aufaclav Zatu Kusuma Frisky; Chien-Yao Wang; Andri Santoso; Jia-Ching Wang

  • Author_Institution
    Department of Computer Science and Information Engineering, National Central University, Taiwan, R.O.C.
  • fYear
    2015
  • Firstpage
    315
  • Lastpage
    319
  • Abstract
    This paper proposes a system for visual speech recognition based on recognizing lip movements through video content analysis. Spatiotemporal feature descriptors are used to extract features from videos containing visual lip information. A preprocessing step removes noise and enhances the contrast of every video frame. The extracted features are used to build a dictionary for a kernel sparse representation classifier (K-SRC) in the classification step, and non-negative matrix factorization (NMF) is adopted to reduce their dimensionality. We evaluated the system on the AVLetters and AVLetters2 datasets, using the same configurations as previous works. On AVLetters, it achieves promising accuracies of 67.13%, 45.37%, and 63.12% in the semi-speaker-dependent, speaker-independent, and speaker-dependent settings, respectively. On AVLetters2, it achieves 89.02% in the speaker-dependent case and 25.9% in the speaker-independent case. These results show that the proposed method outperforms other methods under the same configuration.
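    The NMF dimensionality-reduction step mentioned in the abstract can be sketched with classic Lee–Seung multiplicative updates (a minimal illustration, not the authors' implementation; the matrix sizes and rank below are hypothetical):

    ```python
    import numpy as np

    def nmf(V, r, n_iter=200, seed=0, eps=1e-9):
        """Minimal NMF via multiplicative updates: V ≈ W @ H, all entries >= 0."""
        rng = np.random.default_rng(seed)
        m, n = V.shape
        W = rng.random((m, r)) + eps   # basis matrix (m x r)
        H = rng.random((r, n)) + eps   # coefficient matrix (r x n)
        for _ in range(n_iter):
            # Multiplicative updates keep W and H non-negative by construction.
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        return W, H

    # Hypothetical data: 100 non-negative feature vectors of dimension 64,
    # reduced to rank 10; the columns of H are the low-dimensional codes.
    V = np.abs(np.random.default_rng(1).random((64, 100)))
    W, H = nmf(V, r=10)
    print(W.shape, H.shape)  # (64, 10) (10, 100)
    ```

    In the paper's pipeline the reduced representations (the columns of H here) would then feed the K-SRC dictionary; that classifier is not sketched here.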
  • Keywords
    "Feature extraction","Kernel","Visualization","Speech recognition","Dictionaries","Testing","Mouth"
  • Publisher
    ieee
  • Conference_Titel
    2015 International Carnahan Conference on Security Technology (ICCST)
  • Print_ISBN
    978-1-4799-8690-3
  • Electronic_ISBN
    2153-0742
  • Type
    conf
  • DOI
    10.1109/CCST.2015.7389703
  • Filename
    7389703