Regional Information Center for Science and Technology - Large vocabulary audio-visual speech recognition using active shape models

DocumentCode :

2874811

Title :

Large vocabulary audio-visual speech recognition using active shape models

Author :

Faruquie, Tanveer A. ; Majumdar, Abhik ; Rajput, Nitendra ; Subramaniam, L.V.

Author_Institution :

IBM India Res. Lab., New Delhi, India

Volume :

fYear :

2000

fDate :

2000

Firstpage :

106

Abstract :

Orthogonal information present in the video signal associated with the audio helps improve the accuracy of a speech recognition system. Audio-visual speech recognition involves extracting both audio and visual features from the input signal. Visual parameters are extracted by recognizing speech-dependent features in the video sequence. The paper uses geometrical features to describe the lip shapes, and curve-based active shape models are used to extract the geometry. These geometrically represented visual parameters are used along with the audio cepstral features to perform audio-visual classification. The bimodal system presented is shown to improve classification results over classification using only the audio features.

Keywords :

acoustic signal processing; feature extraction; geometry; image sequences; principal component analysis; signal classification; speech recognition; active shape models; audio cepstral features; audio-visual classification; bimodal system; geometrical features; geometrically represented visual parameters; large vocabulary audio-visual speech recognition; lip shapes; orthogonal information; speech dependent features; video sequence; Active shape model; Cepstral analysis; Data mining; Deformable models; Facial features; Feature extraction; Humans; Speech recognition; Video sequences; Vocabulary;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Pattern Recognition, 2000. Proceedings. 15th International Conference on

Conference_Location :

Barcelona

ISSN :

1051-4651

Print_ISBN :

0-7695-0750-6

Type :

conf

DOI :

10.1109/ICPR.2000.903496

Filename :

903496

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2874811