DocumentCode
1702883
Title
Enhancement of VSR using low dimension visual feature
Author
Upadhyaya, Parag ; Farooq, Omar ; Varshney, Praveen ; Upadhyaya, Ajay
Author_Institution
Dept. of Electron. & Commun. Eng., SRI-Datia, Datia, India
fYear
2013
Firstpage
71
Lastpage
74
Abstract
This paper presents a study of low-dimension visual (LDV) space features and investigates the improvement in audio-visual automatic speech recognition obtained with different sets of visual features. The experiment is divided into three phases: in the first phase, recognition is performed on 12 static DCT features; in the second phase, on a combination of 6 static and 6 dynamic features; and in the third phase, on 12 low-dimension DCT features. For this work, the Hindi AMUAV (Aligarh Muslim University Audio-Visual) database was developed, with audio sampled at 44.1 kHz and video recorded at 25 frames per second. A hidden Markov model (HMM) toolkit with left-to-right HMMs was used for recognition, and an overall improvement of 26.04% in word recognition is achieved with the LDV space features.
Keywords
audio-visual systems; discrete cosine transforms; feature extraction; hidden Markov models; speech recognition; Aligarh Muslim University; HMM; Hindi AMUAV database; LDV space features; VSR enhancement; audio sample; audio visual automatic speech recognition; audio-visual database; frequency 44.1 kHz; hidden Markov model; low dimension DCT feature; low dimension visual space features; static DCT features; visual speech recognition; word recognition; Databases; Discrete cosine transforms; Feature extraction; Hidden Markov models; Speech; Speech recognition; Visualization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia, Signal Processing and Communication Technologies (IMPACT), 2013 International Conference on
Conference_Location
Aligarh
Print_ISBN
978-1-4799-1202-5
Type
conf
DOI
10.1109/MSPCT.2013.6782090
Filename
6782090