DocumentCode :
2790637
Title :
Speaker independent visual-only language identification
Author :
Newman, Jacob L. ; Cox, Stephen J.
Author_Institution :
Sch. of Comput. Sci., Univ. of East Anglia, Norwich, UK
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
5026
Lastpage :
5029
Abstract :
We describe experiments in visual-only language identification (VLID), in which only lip shape, appearance and motion are used to determine the language of a spoken utterance. In previous work, we showed that this is possible in speaker-dependent mode, i.e. identifying the language spoken by a multilingual speaker. Here, by appropriately modifying techniques that have been successful in audio language identification, we extend the work to discriminating between two languages in speaker-independent mode. Our results indicate that even with viseme accuracy as low as about 34%, reasonable discrimination can be obtained. A simulation of degraded viseme recognition accuracy indicates that high VLID accuracy should be achievable with viseme recognition error rates of the order of 50%.
Keywords :
audio signal processing; image motion analysis; image recognition; natural language processing; speaker recognition; audio language identification; lip shape; multilingual speaker; speaker independent visual-only language identification; speaker-dependent mode; viseme recognition; Cameras; Databases; Degradation; Jacobian matrices; Natural languages; Shape; Speech processing; Speech recognition; Visual communication; Working environment noise; language identification; lip-reading;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495071
Filename :
5495071