Title : 
Modeling focus of attention for meeting indexing based on multiple cues
         
        
            Author : 
Stiefelhagen, Rainer ; Yang, Jie ; Waibel, Alex
         
        
            Author_Institution : 
Inst. for Logic, Complexity & Deduction Syst., Univ. of Karlsruhe, Germany
         
        
        
        
        
            fDate : 
7/1/2002 12:00:00 AM
         
        
        
        
            Abstract : 
A user's focus of attention plays an important role in human-computer interaction applications, such as ubiquitous computing environments and intelligent spaces, where the user's goals and intent have to be continuously monitored. We are interested in modeling people's focus of attention in a meeting situation. We propose to model participants' focus of attention from multiple cues. We have developed a system to estimate participants' focus of attention from gaze directions and sound sources. We employ an omnidirectional camera to simultaneously track participants' faces around a meeting table and use neural networks to estimate their head poses. In addition, we use microphones to detect who is speaking. The system predicts participants' focus of attention from acoustic and visual information separately, and then combines the outputs of the audio- and video-based focus of attention predictors. We have evaluated the system using data from three recorded meetings. Adding the acoustic information provided an 8% relative error reduction on average compared to using only one modality. The focus of attention model can be used as an index for a multimedia meeting record. It can also be used for analyzing a meeting.
         
        
            Keywords : 
business data processing; image motion analysis; multilayer perceptrons; multimedia systems; speech recognition; tracking; user interfaces; audio; face tracking; focus of attention; gaze directions; head pose estimation; human-computer interaction; intelligent space; meeting indexing; microphones; multilayer perceptron; multimedia meeting record; multiple cues; neural networks; omnidirectional camera; sound sources; ubiquitous computing; video; Acoustic signal detection; Application software; Cameras; Collaborative work; Face detection; Indexing; Microphones; Monitoring; Neural networks; Ubiquitous computing
         
        
        
            Journal_Title : 
Neural Networks, IEEE Transactions on
         
        
        
        
        
            DOI : 
10.1109/TNN.2002.1021893