Title :
Effective Video Annotation by Mining Visual Features and Speech Features
Author :
Tseng, Vincent S. ; Su, Ja-Hwung ; Chen, Chih-Jen
Author_Institution :
Nat. Cheng Kung Univ., Tainan
Abstract :
In the area of multimedia processing, a number of studies have been devoted to narrowing the gap between multimedia content and human perception. In fact, multimedia understanding remains a difficult and challenging task even with machine-learning techniques. To address this challenge, in this paper we propose an innovative method that employs data mining techniques and a content-based paradigm to conceptualize videos. Our proposed method focuses on: (1) construction of prediction models, namely the speech-association model ModelSass and the visual-statistical model ModelCRM, and (2) fusion of these prediction models to annotate unknown videos automatically. Without additional manual cost, the discovered speech-association patterns reveal the implicit relationships among sequential images. On the other hand, visual features compensate for the inadequacy of speech-association patterns. Empirical evaluations reveal that, on average, our approach delivers more promising results than other methods for annotating videos.
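The abstract describes fusing concept scores from a speech-association model and a visual-statistical model to annotate unknown videos. The sketch below illustrates one plausible fusion step; the linear weighting, the function name fuse_annotation_scores, and all parameter values are assumptions for illustration only and are not taken from the paper.

```python
# A minimal sketch (not the authors' exact method) of fusing per-concept scores
# from two prediction models for video annotation. The linear combination with
# weight alpha is an assumed fusion rule; the abstract does not specify one.
from typing import Dict, List


def fuse_annotation_scores(
    speech_scores: Dict[str, float],   # concept -> score from speech-association patterns
    visual_scores: Dict[str, float],   # concept -> score from the visual-statistical model
    alpha: float = 0.5,                # assumed weight on the speech-association model
    top_k: int = 5,
) -> List[str]:
    """Combine both models' concept scores and return the top-k annotation concepts."""
    concepts = set(speech_scores) | set(visual_scores)
    fused = {
        c: alpha * speech_scores.get(c, 0.0) + (1.0 - alpha) * visual_scores.get(c, 0.0)
        for c in concepts
    }
    return sorted(fused, key=fused.get, reverse=True)[:top_k]


# Example usage with made-up scores for one unknown video shot:
speech = {"beach": 0.7, "crowd": 0.2}
visual = {"beach": 0.6, "sky": 0.4, "crowd": 0.1}
print(fuse_annotation_scores(speech, visual, alpha=0.5, top_k=3))
```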
Keywords :
data mining; speech processing; statistical analysis; video signal processing; content-based paradigm; data mining technique; multimedia processing; speech feature mining; speech-association model; video annotation; visual feature mining; visual-statistical model; Computer science; Content based retrieval; Costs; Data mining; Hidden Markov models; Humans; Image segmentation; Predictive models; Probability; Speech;
Conference_Titel :
Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIHMSP 2007)
Conference_Location :
Kaohsiung
Print_ISBN :
978-0-7695-2994-1
DOI :
10.1109/IIHMSP.2007.4457526