Title :
LipActs: Efficient representations for visual speakers
Author_Institution :
AT&T Labs Research, Middletown, NJ, USA
Abstract :
Video-based lip activity analysis has been used successfully to assist speech recognition for almost a decade. Surprisingly, the same capability has not been widely applied to near real-time visual speaker retrieval and verification, owing to tracking complexity, inadequate or difficult feature determination, and the need for large amounts of pre-labeled data for model training. This paper explores the performance of several solutions built on modern histogram of oriented gradients (HOG) features and several quantization techniques, and analyzes the benefits of temporal sampling and spatial partitioning to derive a representation called LipActs. Two datasets are used for evaluation: one with 81 participants drawn from varying-quality YouTube content, and one with 3 participants captured by a forward-facing mobile video camera in 10 varied lighting and capture-angle environments. Over these datasets, LipActs with a moderate number of pooled temporal frames and multi-resolution spatial quantization offer an improvement of 37-73% over raw features when optimizing for the lowest equal error rate (EER).
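To make the kind of representation the abstract describes more concrete, the sketch below pools HOG descriptors over a short temporal window of cropped mouth-region frames and concatenates features computed at two spatial resolutions. This is an illustrative approximation, not the authors' exact pipeline: all parameter values (orientation bins, cell sizes, window length) and the averaging-based pooling are assumptions.

```python
# Minimal sketch of a LipActs-style descriptor, assuming cropped,
# equal-size grayscale mouth-region frames as input. All parameters
# below are illustrative assumptions, not values from the paper.
import numpy as np
from skimage.feature import hog

def lip_descriptor(mouth_frames, pool_size=5):
    """Pool HOG features over a short temporal window of frames."""
    feats = []
    for frame in mouth_frames[:pool_size]:
        # Multi-resolution spatial quantization: concatenate HOG
        # features from a coarse and a fine cell grid.
        coarse = hog(frame, orientations=9,
                     pixels_per_cell=(16, 16), cells_per_block=(1, 1))
        fine = hog(frame, orientations=9,
                   pixels_per_cell=(8, 8), cells_per_block=(1, 1))
        feats.append(np.concatenate([coarse, fine]))
    # Temporal pooling: average the per-frame descriptors.
    return np.mean(feats, axis=0)
```

Descriptors produced this way could then be compared (e.g., by cosine or Euclidean distance) for retrieval or verification against enrolled speakers.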
Keywords :
computational complexity; speech recognition; video signal processing; EER; HOG; LipActs; YouTube content; equal error rate; histogram of oriented gradients; mobile video camera; quantization techniques; spatial partitioning; temporal sampling; tracking complexity; video-based lip activity analysis; visual speaker retrieval; Detectors; Face; Feature extraction; Histograms; Quantization; Visualization; Vocabulary; learning systems; verification; video analysis;
Conference_Titel :
2011 IEEE International Conference on Multimedia and Expo (ICME)
Conference_Location :
Barcelona, Spain
Print_ISBN :
978-1-61284-348-3
Electronic_ISSN :
1945-7871
DOI :
10.1109/ICME.2011.6012102