Regional Information Center for Science and Technology - Low-dimensional motion features for audio-visual speech recognition

DocumentCode :

705876

Title :

Low-dimensional motion features for audio-visual speech recognition

Author :

Valles Carboneras, Andres ; Gurban, Mihai ; Thiran, Jean-Philippe

Author_Institution :

E.T.S.I. de Telecomun., Univ. Politec. de Madrid, Madrid, Spain

fYear :

2007

fDate :

3-7 Sept. 2007

Firstpage :

297

Lastpage :

301

Abstract :

Audio-visual speech recognition promises to improve the performance of speech recognizers, especially when the audio is corrupted, by adding information from the visual modality, more specifically, from the video of the speaker. However, the number of visual features that are added is typically bigger than the number of audio features, for a small gain in accuracy. We present a method that shows gains in performance comparable to the commonly-used DCT features, while employing a much smaller number of visual features based on the motion of the speaker's mouth. Motion vector differences are used to compensate for errors in the mouth tracking. This leads to a good performance even with as few as 3 features. The advantage of low-dimensional features is that a good accuracy can be obtained with relatively little training data, while also increasing the speed of both training and testing.

Keywords :

audio signal processing; audio-visual systems; discrete cosine transforms; speaker recognition; DCT features; audio features; audio-visual speech recognition; low-dimensional features; low-dimensional motion features; motion vector; mouth tracking; speaker mouth; visual features; visual modality; Discrete cosine transforms; Feature extraction; Hidden Markov models; Mouth; Optical imaging; Speech recognition; Visualization;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal Processing Conference, 2007 15th European

Conference_Location :

Poznan

Print_ISBN :

978-839-2134-04-6

Type :

conf

Filename :

7098812

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=705876