Title :
Generating coherent natural language annotations for video streams
Author :
Khan, Muhammad Usman Ghani ; Lei Zhang ; Gotoh, Yusuke
Author_Institution :
Univ. of Sheffield, Sheffield, UK
Date :
Sept. 30 - Oct. 3, 2012
Abstract :
This contribution addresses the generation of natural language annotations for human actions, behaviour and interactions with other objects observed in video streams. The work starts with conventional image processing techniques that extract high-level features from individual frames; a natural language description of the frame contents is then produced from these features. Although the feature extraction processes are error-prone at various levels, we explore approaches to combining their outputs into a coherent description. To extend the approach to video streams, video feature units are introduced that incorporate spatial and temporal information to produce coherent, smooth and well-phrased descriptions. Evaluation is performed by calculating ROUGE scores between human-annotated and machine-generated descriptions.
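The ROUGE evaluation mentioned in the abstract measures n-gram overlap between a machine-generated description and a human-written reference. As a minimal sketch (the function name and example sentences are illustrative, not taken from the paper), ROUGE-1 recall counts the fraction of reference unigrams that also appear in the candidate:

```python
from collections import Counter

def rouge_1_recall(reference: str, candidate: str) -> float:
    """ROUGE-1 recall: fraction of reference unigrams matched by the candidate."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    if not ref_counts:
        return 0.0
    # Clipped match: each word counts at most as often as it occurs in either text.
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    return overlap / sum(ref_counts.values())

# Hypothetical human reference vs. machine-generated description:
score = rouge_1_recall("a man walks across the road", "a man is walking on the road")
print(round(score, 3))  # 4 of 6 reference unigrams matched -> 0.667
```

Full ROUGE evaluations typically also report bigram (ROUGE-2) and longest-common-subsequence (ROUGE-L) variants, averaged over multiple human references.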
Keywords :
feature extraction; natural language processing; video signal processing; video streaming; ROUGE scores; human action; human annotated description; image processing; machine generated description; natural language annotation; natural language description; video stream; Feature extraction; Humans; Legged locomotion; Natural languages; Streaming media; Video sequences; Visualization; Natural language description; Video annotation; Video processing; video feature units;
Conference_Titel :
Image Processing (ICIP), 2012 19th IEEE International Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
978-1-4673-2534-9
ISSN :
1522-4880
DOI :
10.1109/ICIP.2012.6467504