DocumentCode
252471
Title
An approach to vowel recognition using 2DDWT based visual information of the lip region
Author
Fattah, Shaikh Anowarul ; Rubaiyat, A.H.M. ; Hassan, Mohammad
Author_Institution
Dept. of Electr. & Electron. Eng., Bangladesh Univ. of Eng. & Technol., Dhaka, Bangladesh
fYear
2014
fDate
3-6 Aug. 2014
Firstpage
1089
Lastpage
1092
Abstract
In this paper, a vowel recognition scheme using visual information is proposed based on the two-dimensional discrete wavelet transform (2D-DWT). First, a video frame corresponding to a steady vowel zone is selected using the speech characteristics of the audio frames. Next, a pixel-based method is proposed to identify the lip region of a given video frame by exploiting the intensity variation across color planes. The 2D-DWT is then applied to a combined image plane formed as the weighted sum of the red and green plane pixels of the lip image. Lower-order wavelet coefficients obtained after second-level decomposition, together with the differences among those coefficients, are used as the proposed features. Classification accuracy is evaluated with leave-one-out cross-validation using a distance-based classifier. The proposed method is tested on a publicly available standard audiovisual database, and a high level of recognition accuracy is achieved using only the extracted visual features.
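The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the wavelet family, the plane weights, how many coefficients count as "lower order", how the differences are formed, or which distance is used. The sketch below assumes a Haar wavelet, equal weights of 0.5, the first four approximation coefficients, adjacent differences, and a Euclidean nearest-neighbour classifier; all function and parameter names are hypothetical.

```python
import math

def combined_plane(red, green, w_red=0.5, w_green=0.5):
    """Weighted sum of the red and green planes (weights are assumed values)."""
    return [[w_red * r + w_green * g for r, g in zip(red_row, green_row)]
            for red_row, green_row in zip(red, green)]

def haar_approx(img):
    """One level of the 2D Haar DWT, keeping only the approximation (LL) band."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * i][2 * j] + img[2 * i][2 * j + 1]
              + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 2.0
             for j in range(cols)]
            for i in range(rows)]

def features(red, green, n_coeffs=4):
    """Second-level LL coefficients plus their adjacent differences.

    Taking the first n_coeffs coefficients in row-major order is an
    assumption about what "lower order" means.
    """
    plane = combined_plane(red, green)
    ll2 = haar_approx(haar_approx(plane))  # second-level decomposition
    coeffs = [c for row in ll2 for c in row][:n_coeffs]
    diffs = [b - a for a, b in zip(coeffs, coeffs[1:])]
    return coeffs + diffs

def leave_one_out_accuracy(samples):
    """samples: list of (feature_vector, label) pairs.

    Each sample is classified by its Euclidean nearest neighbour among
    all remaining samples; the fraction classified correctly is returned.
    """
    correct = 0
    for i, (x, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        nearest = min(rest, key=lambda s: math.dist(x, s[0]))
        correct += nearest[1] == label
    return correct / len(samples)
```

In use, each lip image would contribute one `features(red, green)` vector paired with its vowel label, and the resulting list would be passed to `leave_one_out_accuracy`. Image dimensions are assumed to be divisible by four so that two decomposition levels are possible.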
Keywords
discrete wavelet transforms; feature extraction; image classification; image colour analysis; speech recognition; video signal processing; 2DDWT based visual information; audio frames; combined image plane; distance based classifier; green plane pixels; leave one out cross validation technique; lip region; lower order wavelet coefficients; pixel-based method; red plane pixels; second level decomposition; speech characteristics; steady vowel zone; two dimensional discrete wavelet transform; video frame; visual feature extraction; vowel recognition scheme; Accuracy; Discrete wavelet transforms; Feature extraction; Image color analysis; Speech; Speech recognition; Visualization; classification; discrete wavelet transform; feature extraction; lip detection; video frame analysis; vowel recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Circuits and Systems (MWSCAS), 2014 IEEE 57th International Midwest Symposium on
Conference_Location
College Station, TX
ISSN
1548-3746
Print_ISBN
978-1-4799-4134-6
Type
conf
DOI
10.1109/MWSCAS.2014.6908608
Filename
6908608