DocumentCode :
2234677
Title :
A real-time lip localization and tacking for lip reading
Author :
WenJuan, Yao ; YaLing, Liang ; Minghui, Du
Author_Institution :
Sch. of Electron. & Inf. Eng., South China Univ. of Technol., Guangzhou, China
Volume :
6
fYear :
2010
fDate :
20-22 Aug. 2010
Abstract :
Most automatic speech recognition systems concentrate exclusively on the acoustic speech signal and are therefore susceptible to acoustic noise. The benefits of visual speech cues have motivated significant interest in automatic lip-reading, which aims to improve automatic speech recognition by exploiting informative visual features of a speaker's mouth region; among these, the speaker's lip motion stands out as the most linguistically informative visual feature. In this paper, we present an improved, robust lip localization and tracking approach that aims to improve lip-reading accuracy. Lip regions of interest are detected by a new method built on Intel's open-source computer vision library (OpenCV). In this method, we analyze the geometric relationship between the face, eyes and mouth, from which the mouth region can be easily located; this proves to be an effective basis for lip tracking. In the subsequent step, the image is converted from the RGB color space to the Lab color space, and the a component of Lab is used to segment the lips and track the lip region more accurately and efficiently in video sequences of a speaker's talking face under different lighting conditions and with different lip shapes and head poses. Extensive experiments show that the proposed method achieves performance superior to that of similar lip tracking approaches and can therefore be effectively integrated into lip-reading or visual speech recognition systems.
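The two steps named in the abstract (locating the mouth from the face geometry, then segmenting the lips on the a component of Lab) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the face box is assumed to come from some external detector, and the function names, the mouth-box proportions (lower third, central half of the face) and the threshold of 20 are all hypothetical values chosen for illustration only.

```python
def mouth_roi_from_face(face):
    """Guess a mouth box (x, y, w, h) from a face box (x, y, w, h).

    Assumption: the mouth lies in the lower third of the face and in its
    central half. These proportions are illustrative, not from the paper.
    """
    x, y, w, h = face
    return (x + w // 4, y + (2 * h) // 3, w // 2, h // 3)


def srgb_to_lab_a(r, g, b):
    """Return the a* component of CIE Lab for one 8-bit sRGB pixel (D65 white)."""
    def linearize(c):
        # Undo the sRGB gamma curve.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear RGB -> XYZ, with X normalized by the D65 reference white.
    x = (0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl) / 0.95047
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl

    def f(t):
        # Standard CIE Lab companding function.
        return t ** (1.0 / 3.0) if t > 0.008856 else 7.787 * t + 16.0 / 116.0

    return 500.0 * (f(x) - f(y))


def lip_mask(pixels, threshold=20.0):
    """Mark pixels whose a* exceeds a (hypothetical) threshold as lip candidates.

    pixels is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [[srgb_to_lab_a(*p) > threshold for p in row] for row in pixels]


if __name__ == "__main__":
    print(mouth_roi_from_face((100, 50, 200, 240)))
    # One reddish (lip-like) pixel next to one skin-like pixel.
    print(lip_mask([[(200, 80, 90), (220, 180, 160)]]))
```

The a axis of Lab runs from green to red, so reddish lip pixels tend to score higher on it than the surrounding skin; in practice the threshold would need to be tuned or chosen adaptively per video.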
Keywords :
image colour analysis; image recognition; image segmentation; image sequences; speech recognition; Intel Open source; Lab color space; OpenCV; RGB color space; automatic speech recognition systems; lip reading; lip segmentation; lip tacking; real-time lip localization; speaker lip motion; video sequences; visual speech recognition systems; Color; Lighting; Lips; Mouth; Visualization; OpenCV; a component; lip tracking;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Computer Theory and Engineering (ICACTE), 2010 3rd International Conference on
Conference_Location :
Chengdu
ISSN :
2154-7491
Print_ISBN :
978-1-4244-6539-2
Type :
conf
DOI :
10.1109/ICACTE.2010.5579830
Filename :
5579830