Title :
Audio-visual content analysis for content-based video indexing
Author :
Tsekeridou, Sofia ; Pitas, Ioannis
Author_Institution :
Dept. of Inf., Aristotelian Univ. of Thessaloniki, Greece
Abstract :
An audio-visual content analysis method is presented, which analyzes both auditory and visual information sources and accounts for their inter-relations and coincidence to extract high-level semantic information. Both shot-based and object-based access to the visual information are employed. Because video is temporal in nature, time has to be accounted for; thus, time-constrained video labelling functions are generated. Audio source parsing leads to the extraction of a speaker identity mapping function over time. Visual source parsing results in the extraction of a talking face shot mapping function over time. Integration of the audio and visual mappings, constrained by interaction rules, leads to more detailed video content descriptions and even partial detection of the video's context.
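The integration step described in the abstract can be illustrated with a minimal sketch. It assumes each mapping function is available as a list of labelled time intervals in seconds. The names `label_at` and `integrate`, the interval representation, and the three interaction rules in the code are hypothetical and chosen only for illustration; the abstract does not specify the rules the authors use.

```python
from typing import List, Optional, Tuple

# A time-constrained labelling function, stored as (start, end, label) intervals.
# This representation is an assumption made for the sketch.
Interval = Tuple[float, float, str]


def label_at(mapping: List[Interval], t: float) -> Optional[str]:
    """Return the label active at time t, or None if no interval covers t."""
    for start, end, label in mapping:
        if start <= t < end:
            return label
    return None


def integrate(speaker_map: List[Interval], face_map: List[Interval]) -> List[Interval]:
    """Combine the audio and visual mappings using illustrative interaction rules."""
    # Cut the timeline at every interval boundary of either mapping.
    cuts = sorted({t for s, e, _ in speaker_map + face_map for t in (s, e)})
    result: List[Interval] = []
    for a, b in zip(cuts, cuts[1:]):
        speaker = label_at(speaker_map, a)
        face = label_at(face_map, a)
        # Hypothetical rules, not taken from the paper.
        if speaker and face:
            desc = f"{speaker} speaking on screen"
        elif speaker:
            desc = f"{speaker} speaking off screen"
        elif face:
            desc = "talking face, speaker unknown"
        else:
            desc = "no speech"
        # Merge with the previous segment when the description is unchanged.
        if result and result[-1][2] == desc and result[-1][1] == a:
            result[-1] = (result[-1][0], b, desc)
        else:
            result.append((a, b, desc))
    return result


speaker_map = [(0.0, 5.0, "A"), (5.0, 8.0, "B")]
face_map = [(2.0, 6.0, "talking_face")]
segments = integrate(speaker_map, face_map)
```

Here `segments` splits the eight seconds into four labelled spans, separating the moments where a speaker is heard but not seen from those where the speaker's face is on screen.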
Keywords :
audio-visual systems; content-based retrieval; database indexing; multimedia databases; temporal databases; video databases; audio source parsing; audio-visual content analysis; content-based video indexing; high-level semantic information; object-based access; shot mapping function; shot-based access; speaker identity mapping function; talking face; temporal database; time-constrained video labelling; video content description; visual information; visual source parsing; Cepstral analysis; Content based retrieval; Data mining; Indexing; Informatics; Information analysis; Information retrieval; Labeling; Performance analysis; Speech analysis
Conference_Title :
Multimedia Computing and Systems, 1999. IEEE International Conference on
Conference_Location :
Florence
Print_ISBN :
0-7695-0253-9
DOI :
10.1109/MMCS.1999.779279