DocumentCode :
3165743
Title :
Insights into machine lip reading
Author :
Lan, Yuxuan ; Harvey, Richard ; Theobald, Barry-John
Author_Institution :
Sch. of Comput. Sci., Univ. of East Anglia, Norwich, UK
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
4825
Lastpage :
4828
Abstract :
Computer lip-reading is one of the great signal processing challenges. Not only is the signal noisy, it is variable. However, it is almost unknown to compare the performance with human lip-readers. Partly this is because of the paucity of human lip-readers and partly because most automatic systems only handle data that are trivial and therefore not representative of human speech. Here we generate a multiview dataset using connected words that can be analysed by an automatic system, based on linear predictive trackers and active appearance models, and human lip-readers. The automatic system we devise has a viseme accuracy of ≈ 46%, which is comparable to poor professional human lip-readers. However, unlike human lip-readers, our system is good at guessing its fallibility.
Keywords :
signal processing; speech processing; speech recognition; active appearance model; automatic system; computer lip reading; human speech; linear predictive trackers; machine lip reading; multiview dataset; professional human lip readers; signal noisy; viseme accuracy; Accuracy; Active appearance model; Hidden Markov models; Humans; Speech; Speech recognition; Visualization; automated lip-reading; visual speech
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6288999
Filename :
6288999