Regional Information Center for Science and Technology - Impact of each camera on multiple camera visual speech recognizer using ANOVA: A brief study

Abstract:

Multiple-camera fusion is an essential part of multi-camera computer vision applications. The visual modality plays a vital role in computer vision systems when the acoustic modality is corrupted by background noise. A multiple-camera protocol allows the user to move freely, and the cameras can provide information complementary to one another. This study examines the influence of each camera on a visual speech recognizer using one-way analysis of variance (ANOVA). We use AVICAR, a real-world four-camera audio-visual corpus, for this study. ANOVA is applied to each pair of cameras to explore the effect of different viewing angles. The ANOVA test shows the influence of the side-facing and central-facing cameras on the AVICAR visual speech recognizer (VSR). Based on the ANOVA F-statistic, the multiple camera streams are fused into one visual feature vector. Dynamic visual speech information is captured using Motion History Images (MHI). Zernike Moments (ZM) are used as the visual feature to carry out the study. Four-camera visual features show ample improvement over single-camera visual features across all driving conditions.
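As an illustration of the pairwise one-way ANOVA step described in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function name `one_way_anova_f`, the camera labels, and the sample values are all hypothetical, and the abstract does not specify which quantity per camera is compared or how the F-statistics drive the fusion. The sketch only shows how an F-statistic (between-group mean square divided by within-group mean square) could be computed for every pair of four cameras.

```python
from itertools import combinations


def one_way_anova_f(*groups):
    """Return the one-way ANOVA F-statistic for two or more sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical per-camera measurements (illustrative values only,
# not taken from AVICAR or from the study).
scores = {
    "cam1": [1.0, 2.0, 3.0],
    "cam2": [2.0, 3.0, 4.0],
    "cam3": [1.0, 2.0, 3.0],
    "cam4": [5.0, 6.0, 7.0],
}

# One F-statistic per unordered pair of cameras (6 pairs for 4 cameras).
pairwise_f = {
    (a, b): one_way_anova_f(scores[a], scores[b])
    for a, b in combinations(scores, 2)
}

for pair, f_value in pairwise_f.items():
    print(pair, round(f_value, 3))
```

A large F for a pair suggests that the two cameras' measurements differ by more than their within-camera spread would explain, while an F near zero suggests the two views behave similarly. In practice a library routine such as `scipy.stats.f_oneway` would also return the corresponding p-value.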