DocumentCode :
427567
Title :
Robust multi-modal person identification with tolerance of facial expression
Author :
Fox, Niall A. ; Reilly, Richard B.
Author_Institution :
Dept. of Electron. & Electr. Eng., Univ. Coll. Dublin, Ireland
Volume :
1
fYear :
2004
fDate :
10-13 Oct. 2004
Firstpage :
580
Abstract :
This work describes audio-visual speaker identification experiments carried out on a large data set of 251 subjects. Both the audio and visual modalities are modeled using hidden Markov models. The visual modality uses the speaker's lip information. Both the audio and visual modalities are degraded to emulate a train/test mismatch. The fusion method employed adapts automatically, using classifier score reliability estimates for both modalities, and yields improved audio-visual accuracies at all tested levels of audio and visual degradation compared to the individual audio or visual modality accuracies. A maximum visual identification accuracy of 86% was achieved. This result is comparable to the performance of systems using the entire face, and supports the hypothesis that the described system is tolerant of varying facial expression, since only the information around the speaker's lips is employed.
Keywords :
face recognition; hidden Markov models; speech recognition; audio-visual speaker identification; facial expression; visual identification accuracy; multimodal person identification; visual modality; Audio databases; Hidden Markov models; Identification of persons; Image databases; Signal processing; Visual databases;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Systems, Man and Cybernetics, 2004 IEEE International Conference on
Conference_Location :
The Hague
ISSN :
1062-922X
Print_ISBN :
0-7803-8566-7
Type :
conf
DOI :
10.1109/ICSMC.2004.1398362
Filename :
1398362