DocumentCode :
3738530
Title :
High level visual and paralinguistic features extraction and their correlation with user engagement
Author :
Fasih Haider;Fahim A. Salim;Saturnino Luz;Owen Conlan;Nick Campbell
Author_Institution :
ADAPT Centre, School of Computer Science and Statistics, Trinity College Dublin, Ireland
fYear :
2015
Firstpage :
326
Lastpage :
331
Abstract :
As more and more audio-visual content such as talks, lectures and presentations is made available online, it becomes increasingly difficult for prospective viewers to assess which videos they might find interesting or engaging. Automatic classification of content as engaging versus non-engaging might help viewers cope with this situation, and help presenters gauge their presentation skills. In addition, automatic classification could be useful for a variety of applications, including recommendation and personalized video segmentation. This paper explores camera views (close-up shots, distance shots, etc.) along with paralinguistic features which can be used to predict viewer engagement and to give feedback to speakers as to whether and why their talk is engaging or not. The TED talk data set (1340 videos) and user engagement ratings are used in this study. This paper also sheds light on how these engagement ratings are correlated with each other and with the liveliness of speech.
Keywords :
"Videos","Feature extraction","Correlation","Speech","Context","Analysis of variance","Cameras"
Publisher :
IEEE
Conference_Title :
Signal Processing and Information Technology (ISSPIT), 2015 IEEE International Symposium on
Type :
conf
DOI :
10.1109/ISSPIT.2015.7394353
Filename :
7394353