Title :
CENSREC-2-AV: An evaluation framework for bimodal speech recognition in real environments
Author :
Ukai, Naoya ; Kawasaki, T. ; Tamura, Shinji ; Hayamizu, Satoru ; Miyajima, Chiyomi ; Kitaoka, Norihide ; Takeda, Kenji
Author_Institution :
Fac. of Eng., Gifu Univ., Gifu, Japan
Abstract :
In this paper, we introduce a corpus for bimodal speech recognition in real environments. In recent years, speech recognition technology has come to be used in noisy conditions, so it has become necessary to achieve higher recognition accuracy in real environments. As one solution, bimodal speech recognition using audio and non-audio information is being studied. However, there are few databases that can be used to evaluate bimodal speech recognition in real environments. We therefore introduce CENSREC-2-AV, a new bimodal speech recognition corpus that we have been working to build. CENSREC-2-AV is one of the databases of the CENSREC project; we previously provided a similar corpus, CENSREC-1-AV, as a database for bimodal speech recognition under additive noise. Both corpora contain speech data and lip images. Using CENSREC-2-AV, researchers can evaluate in real environments a bimodal speech recognition method built with CENSREC-1-AV, which consists of clean data.
Keywords :
audio databases; image recognition; speech recognition; visual databases; CENSREC-1-AV database; CENSREC-2-AV database; additive noise; audio information; bimodal speech recognition; lip image; nonaudio information; speech data; Databases; Face; Noise measurement; Optical imaging; Speech; Speech recognition; Visualization; CENSREC; audio-visual speech corpus; real environment
Conference_Title :
Speech Database and Assessments (Oriental COCOSDA), 2012 International Conference on
Conference_Location :
Macau
Print_ISBN :
978-1-4673-2811-1
Electronic_ISBN :
978-1-4673-2812-8
DOI :
10.1109/ICSDA.2012.6422476