DocumentCode :
3447429
Title :
Viewing angle estimation for multi-pose AVASR system based on PCNN
Author :
Mengjun Wang ; Xiangling Wang ; Gang Li
Author_Institution :
Sch. of Inf. Eng., HeBei Univ. of Technol., Tianjin, China
fYear :
2012
fDate :
16-18 Oct. 2012
Firstpage :
224
Lastpage :
227
Abstract :
Traditional multi-view audio-visual automatic speech recognition (AVASR) systems project the features into a uniform pose, which adds computation. This work takes a different approach: lipreading for each view uses its own lip vectors, and the viewing angle is estimated before feature extraction. A Pulse Coupled Neural Network (PCNN) extracts features from the gray image sequences of visual speech in order to estimate the viewing angle. The time series, entropy series, logarithm series, and standard deviation are considered as feature vectors. Experiments based on Mean Square Error (MSE) are carried out on a small database for the speaker-dependent case. The results show that PCNN-based feature vectors can distinguish viewing angles of 0°, 45°, and 90°. The highest classification accuracy, 95.64%, is obtained with the logarithm series.
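The abstract does not give the PCNN equations or the exact feature definitions, so the following is only a minimal sketch under stated assumptions. It runs a common simplified PCNN (feeding input equal to the pixel value, 8-neighbour linking, exponentially decaying threshold) on one gray frame, records four per-iteration series, and picks the viewing angle whose stored template has the smallest MSE. The function names, the parameter values, the particular definitions of the entropy, logarithm, and standard deviation series, and the nearest-template use of MSE are all assumptions made for illustration, not the authors' method.

```python
import math


def binary_entropy(p1):
    """Shannon entropy (bits) of a binary firing map with firing fraction p1."""
    if p1 <= 0.0 or p1 >= 1.0:
        return 0.0
    return -p1 * math.log2(p1) - (1.0 - p1) * math.log2(1.0 - p1)


def pcnn_series(image, iterations=20, beta=0.2, alpha=0.3, v_theta=20.0):
    """Run a simplified PCNN on a 2-D list of gray levels in [0, 1].

    Returns (time, entropy, log, std) series, one value per iteration.
    All parameter defaults are illustrative assumptions.
    """
    rows, cols = len(image), len(image[0])
    n = rows * cols
    fired = [[0] * cols for _ in range(rows)]
    theta = [[1.0] * cols for _ in range(rows)]
    time_s, entropy_s, log_s, std_s = [], [], [], []
    for _ in range(iterations):
        new_fired = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Linking input: how many 8-neighbours fired in the last step.
                link = sum(
                    fired[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, rows))
                    for cc in range(max(c - 1, 0), min(c + 2, cols))
                    if (rr, cc) != (r, c)
                )
                # Internal activity: pixel value modulated by the linking input.
                u = image[r][c] * (1.0 + beta * link)
                new_fired[r][c] = 1 if u > theta[r][c] else 0
        for r in range(rows):
            for c in range(cols):
                # Threshold decays, then jumps for neurons that just fired.
                theta[r][c] = (math.exp(-alpha) * theta[r][c]
                               + v_theta * new_fired[r][c])
        fired = new_fired
        count = sum(map(sum, fired))
        p1 = count / n
        time_s.append(count)                     # number of firing neurons
        entropy_s.append(binary_entropy(p1))     # entropy of the firing map
        log_s.append(math.log(count + 1))        # assumed: log(1 + count)
        std_s.append(math.sqrt(p1 * (1.0 - p1))) # std of the binary map
    return time_s, entropy_s, log_s, std_s


def mse(a, b):
    """Mean square error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def estimate_angle(feature, templates):
    """Return the angle whose template vector is closest to feature in MSE."""
    return min(templates, key=lambda angle: mse(feature, templates[angle]))


if __name__ == "__main__":
    # Hypothetical 4x4 frame; real input would be a lip-region gray image.
    frame = [[0.1, 0.4, 0.4, 0.1],
             [0.3, 0.9, 0.9, 0.3],
             [0.3, 0.9, 0.9, 0.3],
             [0.1, 0.4, 0.4, 0.1]]
    _, _, log_series, _ = pcnn_series(frame, iterations=10)
    # Templates would normally be averaged from labelled training frames.
    templates = {0: log_series, 45: [0.0] * 10, 90: [3.0] * 10}
    print(estimate_angle(log_series, templates))
```

In this sketch each series is treated as a separate candidate feature vector, which matches the abstract's report of a best result for the logarithm series alone; a real system would build the per-angle templates from labelled image sequences rather than from the hand-written values used here.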
Keywords :
audio signal processing; entropy; feature extraction; image sequences; mean square error methods; neural nets; pose estimation; speech recognition; time series; MSE; PCNN; entropy series; feature extraction; feature vector; gray image sequences; lip vectors; lipreading; logarithm series; mean square error; multipose AVASR system; multiviews audio-visual automatic speech recognition; projecting processing; pulse coupled neural network; speaker-dependent case; standard deviation; time series; viewing angle estimation; Entropy; Face; Feature extraction; Neurons; Standards; Support vector machine classification; Vectors; Mean Square Error; Pulse Coupled Neural Network (PCNN); Viewing angle estimation; audio-visual automatic speech recognition (AVASR);
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Image and Signal Processing (CISP), 2012 5th International Congress on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4673-0965-3
Type :
conf
DOI :
10.1109/CISP.2012.6469913
Filename :
6469913