Title :
Lip Reading Using Wavelet-Based Features and Random Forests Classification
Author :
Terissi, L.D. ; Parodi, M. ; Gomez, J.C.
Author_Institution :
Lab. for Syst. Dynamics & Signal Process., Univ. Nac. de Rosario, Rosario, Argentina
Abstract :
This paper proposes a visual speech classification scheme based on wavelets and Random Forests (RF). Wavelet multiresolution analysis is used to model the sequence of visual parameters, which are represented by either model-based or image-based features. The coefficients of the resulting wavelet representations serve as features describing the visual information. Lip reading is then performed by applying an RF classifier to these wavelet-based features. The scheme is evaluated on three isolated-word audio-visual databases: two public ones and one compiled by the authors. Experimental results show that the proposed lip reading system achieves good performance on all three databases and outperforms other methods reported in the literature on the two public ones. All experiments used the same configuration; that is, neither the wavelet decomposition parameters nor the RF classifier parameters needed to be adapted to each database.
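The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the two-level decomposition, the choice of which coefficients to keep, and the names `haar_dwt`, `wavelet_features`, and `tracks` are all assumptions made for illustration, since the abstract does not specify them. The resulting fixed-length vector is the kind of input that could then be passed to a Random Forests classifier (for example, scikit-learn's `RandomForestClassifier`), which is omitted here.

```python
import math


def haar_dwt(signal, levels):
    """Multilevel orthonormal Haar decomposition of a 1-D sequence.

    Returns (approximation, details), where details[0] is the finest level
    and details[-1] the coarsest. Assumes len(signal) is divisible by 2**levels.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    s = math.sqrt(2.0)
    for _ in range(levels):
        # Pair up neighbouring samples: scaled differences give the detail
        # coefficients, scaled sums give the next, coarser approximation.
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / s for a, b in pairs])
        approx = [(a + b) / s for a, b in pairs]
    return approx, details


def wavelet_features(param_tracks, levels=2):
    """Build one feature vector from several visual-parameter sequences.

    Assumption: keep the approximation and the coarsest detail coefficients
    of each track; the paper may select coefficients differently.
    """
    features = []
    for track in param_tracks:
        approx, details = haar_dwt(track, levels)
        features.extend(approx)
        features.extend(details[-1])
    return features


# Toy input: two hypothetical visual parameters (say, mouth width and
# mouth height) tracked over 8 video frames of one spoken word.
tracks = [
    [4.0, 4.0, 6.0, 6.0, 8.0, 8.0, 6.0, 6.0],
    [1.0, 3.0, 1.0, 3.0, 1.0, 3.0, 1.0, 3.0],
]
feature_vector = wavelet_features(tracks, levels=2)
```

Because every word is mapped to a vector of the same length regardless of how its parameters vary over time, one such vector per utterance, together with its word label, would form a training example for the classifier.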
Keywords :
audio databases; image classification; image resolution; speech recognition; visual databases; wavelet transforms; Random Forests classification method; audio-visual databases; lip reading system; visual speech classification scheme; wavelet decomposition parameters; wavelet-based features; Hidden Markov models; Mouth; Speech; Speech recognition; Visual databases; Visualization
Conference_Title :
Pattern Recognition (ICPR), 2014 22nd International Conference on
Conference_Location :
Stockholm
DOI :
10.1109/ICPR.2014.146