Title :
Gaze-contingent ASR for spontaneous, conversational speech: An evaluation
Author :
Cooke, Neil ; Russell, Martin
Author_Institution :
Multi-modal Interaction Lab., Birmingham Univ., Birmingham
fDate :
March 31, 2008 - April 4, 2008
Abstract :
Little work has attempted to improve the recognition of spontaneous, conversational speech by adding information from a loosely-coupled modality. This study investigated the idea by integrating gaze information into an automatic speech recognition (ASR) system. A probabilistic framework for multimodal recognition was formalised and applied to the specific case of integrating gaze and speech. Gaze-contingent ASR systems were developed from a baseline ASR system by redistributing language model probability mass according to visual attention. The best-performing systems had Word Error Rates similar to that of the baseline ASR system and showed an increase in keyword spotting accuracy. The key finding was that the observed performance improvements were due to increased recognition accuracy for words associated with the visual field but not with the current focus of visual attention.
Keywords :
speech recognition; word processing; automatic speech recognition; gaze-contingent ASR; keyword spotting accuracy; language model probability mass; loosely-coupled modality; spontaneous conversational speech; visual attention; word error rates; Automatic speech recognition; Error analysis; Human computer interaction; Laboratories; Maximum likelihood decoding; Speech analysis; Speech recognition; User interfaces; Visual system; Vocabulary; Bayes procedures
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518639