Title :
An approach to vowel recognition using 2DDWT based visual information of the lip region
Author :
Fattah, Shaikh Anowarul ; Rubaiyat, A.H.M. ; Hassan, Mohammad
Author_Institution :
Dept. of Electr. & Electron. Eng., Bangladesh Univ. of Eng. & Technol., Dhaka, Bangladesh
Abstract :
In this paper, a vowel recognition scheme using visual information is proposed based on the two-dimensional discrete wavelet transform (2D-DWT). First, a video frame corresponding to a steady vowel zone is selected using the speech characteristics of the audio frames. Next, a pixel-based method is proposed to identify the lip region of a given video frame by exploiting the intensity variation across different color planes. The 2D-DWT is then applied to a combined image plane formed as the weighted sum of the red-plane and green-plane pixels of the lip image. The lower-order wavelet coefficients obtained after second-level decomposition, together with the differences among those coefficients, are used as the proposed features. Classification accuracy is evaluated with leave-one-out cross-validation using a distance-based classifier. The proposed method is tested on a publicly available standard audiovisual database, and a high level of recognition accuracy is achieved using only the extracted visual features.
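The feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the wavelet family, the plane weights, the number of coefficients kept, or the distance measure, so the Haar wavelet, equal weights, the first four approximation coefficients, consecutive differences, and a Euclidean nearest-neighbor rule are all assumptions here. All function names are hypothetical.

```python
import math

def combine_planes(red, green, w_red=0.5, w_green=0.5):
    """Weighted sum of red and green planes (weights are assumed values)."""
    return [[w_red * r + w_green * g for r, g in zip(r_row, g_row)]
            for r_row, g_row in zip(red, green)]

def haar_1d(v):
    """One level of the orthonormal Haar transform: (approximation, detail).
    A trailing sample is dropped when the length is odd."""
    s = math.sqrt(2.0)
    approx = [(v[i] + v[i + 1]) / s for i in range(0, len(v) - 1, 2)]
    detail = [(v[i] - v[i + 1]) / s for i in range(0, len(v) - 1, 2)]
    return approx, detail

def haar_2d_approx(img):
    """Approximation (LL) subband of one 2D Haar level: rows, then columns."""
    rows = [haar_1d(row)[0] for row in img]
    cols = [haar_1d(list(col))[0] for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def extract_features(red, green, n_coeffs=4):
    """Second-level LL coefficients plus their consecutive differences.
    Which coefficients count as 'lower order' is an assumption."""
    plane = combine_planes(red, green)
    ll2 = haar_2d_approx(haar_2d_approx(plane))
    coeffs = [c for row in ll2 for c in row][:n_coeffs]
    diffs = [coeffs[i + 1] - coeffs[i] for i in range(len(coeffs) - 1)]
    return coeffs + diffs

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def leave_one_out_accuracy(features, labels):
    """Hold out each sample in turn and label it by its nearest neighbor
    among the remaining samples (assumed distance-based classifier)."""
    correct = 0
    for i, f in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        nearest = min(others, key=lambda j: euclidean(f, features[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(features)
```

In use, `extract_features` would be called once per cropped lip image (one red and one green plane, each a list of rows), and the resulting feature vectors and vowel labels passed to `leave_one_out_accuracy`. Lip-region detection and steady-vowel frame selection are not covered by this sketch.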
Keywords :
discrete wavelet transforms; feature extraction; image classification; image colour analysis; speech recognition; video signal processing; 2DDWT based visual information; audio frames; combined image plane; distance based classifier; green plane pixels; leave one out cross validation technique; lip region; lower order wavelet coefficients; pixel-based method; red plane pixels; second level decomposition; speech characteristics; steady vowel zone; two dimensional discrete wavelet transform; video frame; visual feature extraction; vowel recognition scheme; Accuracy; Discrete wavelet transforms; Feature extraction; Image color analysis; Speech; Speech recognition; Visualization; classification; discrete wavelet transform; feature extraction; lip detection; video frame analysis; vowel recognition;
Conference_Title :
Circuits and Systems (MWSCAS), 2014 IEEE 57th International Midwest Symposium on
Conference_Location :
College Station, TX
Print_ISBN :
978-1-4799-4134-6
DOI :
10.1109/MWSCAS.2014.6908608