Title :
Profile View Lip Reading
Author :
Kumar, Kush ; Chen, Tsuhan ; Stern, Richard M.
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
In this paper, we introduce profile view (PV) lip reading, a scheme for speaker-dependent isolated word speech recognition. We motivate PV historically by the importance of profile images in facial animation for lip reading, and we present feature extraction schemes for PV as well as for the traditional frontal view (FV) approach. We compare lip reading results for PV and FV, which demonstrate a significant improvement for PV over FV. We show improvement in speech recognition with the integration of audio and visual features. We also find it advantageous to process the visual features over a longer duration than that marked by the endpoints of the speech utterance.
Keywords :
feature extraction; image processing; speech recognition; facial animation; frontal view approach; profile images; profile view lip reading; speaker-dependent isolated word speech recognition; Audio databases; Cellular phones; Discrete cosine transforms; Humans; Image motion analysis; Speech synthesis; Visual databases; Visualization; Audiovisual speech recognition; Profile view; Speechreading; Visual feature extraction
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0727-3
DOI :
10.1109/ICASSP.2007.366941