DocumentCode :
1749628
Title :
Optimal weighting of posteriors for audio-visual speech recognition
Author :
Heckmann, Martin ; Berthommier, Frédéric ; Kroschel, Kristian
Author_Institution :
Inst. de la Communication Parlée, Inst. Nat. Polytech. de Grenoble, France
Volume :
1
fYear :
2001
fDate :
2001
Firstpage :
161
Abstract :
We investigate the fusion of audio and video a posteriori phonetic probabilities in a hybrid ANN/HMM audio-visual speech recognition system. Three basic conditions for the fusion process are stated and implemented in a linear and a geometric weighting scheme. These conditions are the assumption of conditional independence of the audio and video data, and the contribution of only one of the two paths when the SNR is very high or very low, respectively. For the geometric weighting, a new weighting scheme is developed, whereas the linear weighting follows the full combination approach as employed in multi-stream recognition. We compare these two new concepts in audio-visual recognition to a rather standard approach known from the literature. Recognition tests were performed on a continuous number recognition task, using a single-speaker database containing 1712 utterances with two different types of noise added.
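The two families of weighting named in the abstract can be illustrated with a minimal, generic sketch. This is not the scheme developed in the paper: the function names, the single weight `lam` (assumed to lie in [0, 1] and to grow with SNR), and the `eps` guard are all assumptions made for illustration. The sketch only reproduces the two boundary conditions (audio alone at `lam = 1`, video alone at `lam = 0`) and does not implement the conditional-independence condition.

```python
import math


def linear_fusion(p_audio, p_video, lam):
    """Weighted sum of two posterior vectors (hypothetical helper).

    lam = 1 returns the audio posteriors, lam = 0 the video posteriors.
    """
    return [lam * a + (1.0 - lam) * v for a, v in zip(p_audio, p_video)]


def geometric_fusion(p_audio, p_video, lam, eps=1e-12):
    """Weighted product of two posterior vectors, renormalised to sum to 1.

    Computed in the log domain; eps (an assumption) guards against log(0).
    """
    logs = [
        lam * math.log(a + eps) + (1.0 - lam) * math.log(v + eps)
        for a, v in zip(p_audio, p_video)
    ]
    peak = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(x - peak) for x in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Illustrative posteriors over three phoneme classes (made-up values).
p_a = [0.7, 0.2, 0.1]
p_v = [0.3, 0.5, 0.2]
fused_linear = linear_fusion(p_a, p_v, 0.5)
fused_geometric = geometric_fusion(p_a, p_v, 0.5)
```

In both functions the fused vector remains a probability distribution over the classes; how `lam` should be chosen as a function of SNR is the question the paper addresses and is left open here.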
Keywords :
Gaussian noise; audio signal processing; hidden Markov models; neural nets; probability; sensor fusion; speech recognition; video signal processing; white noise; a posteriori phonetic probabilities; audio-visual speech recognition; continuous number recognition task; full combination approach; geometric weighting scheme; hybrid ANN/HMM system; linear weighting scheme; optimal weighting; posteriors; single speaker database; Acoustic noise; Audio databases; Feature extraction; Hidden Markov models; Lips; Performance evaluation; Spatial databases; Speech recognition; Streaming media; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
ISSN :
1520-6149
Print_ISBN :
0-7803-7041-4
Type :
conf
DOI :
10.1109/ICASSP.2001.940792
Filename :
940792