Information fusion techniques in Audio-Visual Speech Recognition

Author

H. Karabalkan;H. Erdogan

Author_Institution

Mühendislik ve Doğa Bilimleri Fakültesi, Sabancı Üniversitesi, Turkey

fYear

2009

fDate

4/1/2009 12:00:00 AM

Firstpage

504

Lastpage

507

Abstract

It is well known that human perception of speech relies on both audio and visual information. However, the physiology of the information fusion process in humans is still not well understood, which draws scientists' attention to information fusion for audio-visual speech recognition. In this work, a novel tandem hybrid approach is introduced for an efficient audio-visual speech recognition system, and the performance of the proposed technique is experimentally compared with the widely used Multiple Stream Hidden Markov Model (MSHMM) approach.

Keywords

"Speech recognition","Hidden Markov models","Mel frequency cepstral coefficient","Discrete cosine transforms","Telecommunication standards","Streaming media","Humans","Linear discriminant analysis","Physiology","Gaussian processes"

Publisher

IEEE

Conference_Title

Signal Processing and Communications Applications Conference, 2009. SIU 2009. IEEE 17th

ISSN

2165-0608

Print_ISBN

978-1-4244-4435-9

Type

conf

DOI

10.1109/SIU.2009.5136443

Filename

5136443

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3632053