Regional Information Center for Science and Technology (RICEST) - Voxel-based Viterbi Active Speaker Tracking (V-VAST) with best view selection for video lecture post-production

DocumentCode :

2163592

Title :

Voxel-based Viterbi Active Speaker Tracking (V-VAST) with best view selection for video lecture post-production

Author :

Kelly, Damien ; Kokaram, Anil ; Boland, Frank

Author_Institution :

Sigmedia Group, Trinity College Dublin, Dublin, Ireland

fYear :

2011

fDate :

22-27 May 2011

Firstpage :

2296

Lastpage :

2299

Abstract :

An automated system is presented for reducing a multi-view lecture recording to a single-view video containing a best-view summary of the active speakers. The system uses skin color detection and voxel-based analysis to identify likely speaker locations. Using time-delay estimates from multiple microphones, speech activity is analyzed for each speaker position. The Viterbi algorithm is then used to estimate the track of the active speaker that maximizes the observed speech activity. This novel approach is termed Voxel-based Viterbi Active Speaker Tracking (V-VAST) and is shown to track speakers with an accuracy of 0.23 m. Using the tracking information, the system then extracts, from the available camera views, the most frontal face view of the active speaker for display.
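
The Viterbi step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `viterbi_track`, the per-frame `activity` scores, the candidate `positions`, and the distance-based `move_penalty` transition cost are all assumptions introduced here for illustration. The sketch only shows how dynamic programming can pick, frame by frame, the sequence of candidate positions with the highest total speech-activity score while discouraging implausible jumps.

```python
import math


def viterbi_track(activity, positions, move_penalty=1.0):
    """Return the index sequence of candidate positions with the best total score.

    activity[t][k]  : hypothetical speech-activity score of position k at frame t
    positions[k]    : hypothetical 3-D coordinates of candidate position k
    move_penalty    : assumed cost per metre moved between consecutive frames
    """
    n_frames, n_pos = len(activity), len(positions)
    score = list(activity[0])  # best score of any path ending at each position
    back = []                  # back[t-1][j]: best predecessor of j at frame t

    for t in range(1, n_frames):
        new_score, pointers = [], []
        for j in range(n_pos):
            # Best previous position i for arriving at j, after the move cost.
            best_i = max(
                range(n_pos),
                key=lambda i: score[i]
                - move_penalty * math.dist(positions[i], positions[j]),
            )
            step_cost = move_penalty * math.dist(positions[best_i], positions[j])
            new_score.append(score[best_i] - step_cost + activity[t][j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Backtrack from the best final position to recover the full track.
    path = [max(range(n_pos), key=lambda k: score[k])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Two candidate positions 5 m apart, three frames of made-up activity scores.
positions = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
activity = [[1.0, 0.0], [0.4, 0.5], [1.0, 0.0]]
smooth_track = viterbi_track(activity, positions, move_penalty=1.0)
free_track = viterbi_track(activity, positions, move_penalty=0.0)
```

With the movement penalty enabled, the brief activity bump at the second position in the middle frame is not enough to justify a 5 m jump, so the track stays at the first position. With the penalty set to zero, the track simply follows the highest score in each frame.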

Keywords :

audio-visual systems; delays; maximum likelihood estimation; microphones; speech processing; video cameras; V-VAST; multiple microphone; multiview lecture recording reduction; skin color detection; speaker location; speech activity analysis; time-delay estimation; track estimation; video lecture post-production; voxel-based Viterbi active speaker tracking; Cameras; Face; Microphones; Skin; Speech; Three dimensional displays; Viterbi algorithm; Audio-Visual Tracking; Multi-camera; Multi-microphone; Time-Delay Estimation; Viterbi;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location :

Prague

ISSN :

1520-6149

Print_ISBN :

978-1-4577-0538-0

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2011.5946941

Filename :

5946941

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2163592