Regional Information Center for Science and Technology - Speaker dependent visual word recognition by using sequential mouth shape codes

DocumentCode :

600144

Title :

Speaker dependent visual word recognition by using sequential mouth shape codes

Author :

Tasaka, Takafumi ; Hamada, Nozomu

Author_Institution :

Dept. of Syst. Design Eng., Keio Univ., Yokohama, Japan

fYear :

2012

fDate :

4-7 Nov. 2012

Firstpage :

Lastpage :

101

Abstract :

Visual speech recognition, or lip reading, is an approach to noise-robust speech recognition that adds the speaker's visual cues to audio information. Visual-only speech recognition is also applicable to speaker verification and to multimedia interfaces that support speech-impaired persons. The sequential mouth-shape code method is an effective lip-reading approach, in particular for uttered Japanese words; it utilizes two kinds of distinctive mouth shapes, known as the first and last mouth shapes, which appear intermittently. One advantage of this method is its low computational burden for the learning and word registration processes. This paper proposes a novel word lip recognition system that detects and determines initial mouth-shape codes in order to recognize uttered consonants. The proposed method is thereby able to discriminate between words that consist of the same sequential vowel codes but contain different consonant codes. The experiments conducted demonstrate that the proposed system provides a higher recognition rate than conventional ones.

Keywords :

audio signal processing; multimedia computing; speech coding; speech recognition; distinctive mouth shape; lip reading; multimedia interface; noise robust speech recognition; sequential mouth shape code; sequential mouth-shape code method; sequential vowel code; speaker dependent visual word recognition; speaker verification; speaking impaired person; uttered Japanese word; uttering consonant recognition; visual cue; visual speech recognition; visual-only speech recognition; word lip recognition system; word registration process; Feature extraction; Image recognition; Mouth; Shape; Speech recognition; Trajectory; Visualization; audio-visual speech recognition; key frame extraction; lip reading; mouth-shape code; visual speech recognition

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligent Signal Processing and Communications Systems (ISPACS), 2012 International Symposium on

Conference_Location :

New Taipei

Print_ISBN :

978-1-4673-5083-9

Electronic_ISBN :

978-1-4673-5081-5

Type :

conf

DOI :

10.1109/ISPACS.2012.6473460

Filename :

6473460

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=600144