DocumentCode :
1830156
Title :
Visual Speech Detection Using an Unsupervised Learning Framework
Author :
Ahmad, Rabiah ; Raza, Syed Paymaan ; Malik, Haroon
Author_Institution :
ECE Dept., Univ. of Michigan - Dearborn, Dearborn, MI, USA
Volume :
2
fYear :
2013
fDate :
4-7 Dec. 2013
Firstpage :
525
Lastpage :
528
Abstract :
This paper presents an unsupervised learning framework for visual speech detection. A bimodal GMM is used to model the visual feature, i.e., mouth region intensity, which varies during speech. Variation in the mouth region intensity is used to classify frames as visual speech or non-speech. The GMM parameters are estimated using the EM algorithm. The proposed algorithm is evaluated on a dataset of 14 video clips containing almost 20,000 frames and is compared with the existing state of the art. Experimental results show that the proposed method achieves a high detection rate and a low false alarm rate.
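(The following is a minimal illustrative sketch of the approach described in the abstract, not the authors' implementation. It assumes a per-frame mean mouth-region intensity is already extracted, and uses scikit-learn's GaussianMixture as a stand-in for the paper's bimodal GMM fitted with EM; the rule that the higher-variance component corresponds to speech is an assumption for illustration.)

import numpy as np
from sklearn.mixture import GaussianMixture

def classify_frames(mouth_intensity):
    """mouth_intensity: 1-D array, one mean mouth-region intensity per frame."""
    x = np.asarray(mouth_intensity, dtype=float).reshape(-1, 1)

    # Two-component (bimodal) GMM; parameters are estimated via EM internally.
    gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
    labels = gmm.fit_predict(x)

    # Assumption: the component with the larger intensity variance is speech,
    # since mouth-region intensity varies more while the speaker talks.
    variances = gmm.covariances_.reshape(2)
    speech_component = int(np.argmax(variances))
    return labels == speech_component  # True = speech frame, False = non-speech

# Example usage with synthetic intensities for ~20,000 frames:
# is_speech = classify_frames(np.random.rand(20000))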
Keywords :
Gaussian processes; expectation-maximisation algorithm; feature extraction; image classification; learning (artificial intelligence); mixture models; video signal processing; EM algorithm; Gaussian mixture model; bimodal GMM parameter estimation; detection rates; expectation maximization algorithm; false alarm rates; mouth region intensity; performance evaluation; unsupervised learning framework; video clips; visual feature modelling; visual nonspeech classification; visual speech classification; visual speech detection; Cavity resonators; Signal processing algorithms; Speech; Teeth; Video sequences; Visualization; Expectation Maximization (EM); Gaussian Mixture Model (GMM); Unsupervised Learning; Voice Activity Detection (VAD);
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Machine Learning and Applications (ICMLA), 2013 12th International Conference on
Conference_Location :
Miami, FL
Type :
conf
DOI :
10.1109/ICMLA.2013.171
Filename :
6786164