Regional Information Center for Science and Technology - Boosting multi-modal camera selection with semantic features

DocumentCode :

3632748

Title :

Boosting multi-modal camera selection with semantic features

Author :

Benedikt Hörnler; Dejan Arsić; Björn Schuller; Gerhard Rigoll

Author_Institution :

Technische Universität München, Institute for Human-Machine-Communication, 80290 Munich, Germany

fYear :

2009

fDate :

6/1/2009 12:00:00 AM

Firstpage :

1298

Lastpage :

1301

Abstract :

In this work, semantic features are used to improve camera selection results. The semantic features are group action, person action, and person speaking. To this end, low-level acoustic and visual features are combined with high-level semantic ones. After feature fusion, segmentation and classification are performed by hidden Markov models. The evaluation shows that an absolute improvement of 6.5% can be achieved: using acoustic and all semantic features reduces the frame error rate to 38.1%. The best model using only low-level features achieves a frame error rate of 44.6%, which is the best one reported on this data set.
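
The frame error rate and the absolute improvement quoted in the abstract can be illustrated with a minimal sketch. It assumes the frame error rate is the fraction of frames whose selected camera differs from a reference annotation; the record does not give the paper's exact scoring procedure. The function names and the camera labels are hypothetical and used only for illustration; the two percentages are the ones stated in the abstract.

```python
def frame_error_rate(reference, hypothesis):
    """Fraction of frames whose selected camera differs from the reference.

    Assumed definition for illustration; not taken from the paper itself.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same length")
    if not reference:
        raise ValueError("label sequences must not be empty")
    errors = sum(r != h for r, h in zip(reference, hypothesis))
    return errors / len(reference)


def absolute_improvement(baseline_fer, improved_fer):
    """Difference between two error rates, in the same units as the inputs."""
    return baseline_fer - improved_fer


# Hypothetical per-frame camera labels, for illustration only.
reference = ["cam1", "cam1", "cam2", "cam2", "cam3"]
hypothesis = ["cam1", "cam2", "cam2", "cam2", "cam1"]
example_fer = frame_error_rate(reference, hypothesis)  # 2 of 5 frames differ

# Figures quoted in the abstract, in percent.
gain = absolute_improvement(44.6, 38.1)
print(f"example FER: {example_fer:.1%}, absolute gain: {gain:.1f} points")
```

The gain is a difference in percentage points (44.6 minus 38.1), not a relative reduction of the baseline error.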

Keywords :

"Boosting","Hidden Markov models","Videoconference","Smart cameras","Error analysis","Streaming media","Minutes","Microphone arrays","Mel frequency cepstral coefficient","Image sequences"

Publisher :

IEEE

Conference_Title :

Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on

ISSN :

1945-7871

Print_ISBN :

978-1-4244-4290-4

Electronic_ISBN :

1945-788X

Type :

conf

DOI :

10.1109/ICME.2009.5202740

Filename :

5202740

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3632748