DocumentCode :
2869457
Title :
Intensity/location normalization for automatic lipreading
Author :
Tanaka, Akiji ; Vanegas, Oscar ; Tokuda, Keiichi ; Kitamura, Tadashi
Author_Institution :
Dept. of Comput. Sci., Nagoya Inst. of Technol., Japan
Volume :
2
fYear :
1998
fDate :
1998
Firstpage :
920
Abstract :
This paper describes intensity and location normalization for improving the performance of a bimodal speech recognition system that uses visual information. In conventional speech recognition, many methods have been proposed for normalizing channel characteristics and speaker individuality. In this study, two methods analogous to cepstral mean normalization (CMN) and speaker adaptive training (SAT) are proposed for intensity and location normalization, respectively, and two kinds of feature vectors, a subsampled image and a 2D-DCT, are compared. Experimental results show that the normalization techniques substantially improve recognition rates.
Keywords :
discrete cosine transforms; feature extraction; image recognition; image sampling; image sequences; speech recognition; 2D-DCT; automatic lipreading; bimodal speech recognition; channel characteristic normalization; feature vectors; image sequences; intensity/location normalization; recognition rate; speaker individuality; speech recognition system; subsampled image; visual information; Cepstral analysis; Character recognition; Computer science; Discrete cosine transforms; Error analysis; Image recognition; Image sequences; Lips; Robustness; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing Proceedings, 1998. ICSP '98. 1998 Fourth International Conference on
Conference_Location :
Beijing
Print_ISBN :
0-7803-4325-5
Type :
conf
DOI :
10.1109/ICOSP.1998.770762
Filename :
770762