Title :
Educational documentary video segmentation and access through combination of visual, audio and text understanding
Author :
Dong, Aijuan ; Li, Honglin
Author_Institution :
Dept. of Comput. Sci., North Dakota State Univ., Fargo, ND
Abstract :
Educational documentary videos play an important role in enriching the learning experience. However, because of their unstructured and linear nature, documentary videos are much more difficult to access than text-based documents and have not been effectively utilized. In this paper, we propose a multimodal, hierarchical documentary video segmentation procedure based on image, audio and text understanding. Documentary video scenes/paragraphs are determined by the coincidence of scene-level audio breaks and text (transcript) breaks obtained from domain-independent text segmentation. Each video scene/paragraph is further segmented into video shots based on visual features. To make effective use of the composite documentary video learning materials generated, we propose a documentary video access platform that supports hierarchical organization of video content, multimodal presentation of information, augmented video content and multi-level flexible search. A prototype platform is implemented to demonstrate the idea.
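The scene-level step described in the abstract (keeping only breaks where the audio and transcript segmentations agree) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `coincident_breaks`, the representation of breaks as timestamps in seconds, and the `tolerance` window are all assumptions made for illustration, since the abstract does not say how coincidence is measured.

```python
from bisect import bisect_left


def coincident_breaks(audio_breaks, text_breaks, tolerance=2.0):
    """Return the audio break times that have a text break nearby.

    Assumption: breaks are timestamps in seconds, and two breaks
    "coincide" when they lie within `tolerance` seconds of each other.
    """
    text_sorted = sorted(text_breaks)
    scene_boundaries = []
    for t in sorted(audio_breaks):
        i = bisect_left(text_sorted, t)
        # Only the nearest text break on each side of t can be within range.
        neighbours = text_sorted[max(i - 1, 0):i + 1]
        if any(abs(t - u) <= tolerance for u in neighbours):
            scene_boundaries.append(t)
    return scene_boundaries


# Hypothetical break times, in seconds from the start of the video.
audio = [10.0, 42.5, 90.0, 130.0]
text = [11.0, 60.0, 129.0]
print(coincident_breaks(audio, text))  # [10.0, 130.0]
```

Each resulting scene would then be split into shots using visual features, a step not sketched here.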
Keywords :
audio signal processing; content-based retrieval; image segmentation; text analysis; video retrieval; video signal processing; audio understanding; augmented video content; documentary video access; educational documentary video segmentation; multi-level flexible search; scene-level audio breaks; text breaks; text understanding; visual understanding; Composite materials; Computer science; Educational programs; Image segmentation; Layout; Neodymium; Prototypes; TV; User interfaces; Vocabulary;
Conference_Title :
Signal Processing and Information Technology, 2005. Proceedings of the Fifth IEEE International Symposium on
Conference_Location :
Athens
Print_ISBN :
0-7803-9313-9
DOI :
10.1109/ISSPIT.2005.1577174