Title :
Joint audio-visual processing, representation and indexing of TV news programmes
Author :
Zdansky, Jindrich ; Chaloupka, Josef ; Nouza, Jan
Author_Institution :
Inst. of Inf. Technol. & Electron., Tech. Univ. of Liberec, Liberec
Abstract :
In this paper, we present a complex platform for the automatic processing of Czech TV news programmes. Its audio processing module provides a text transcription in the form of metadata that contain information about the spoken content, speaker identities, the pronunciation used, word positions and intonation. The video processing module provides pictures representing individual video scenes, together with information about detected and, where possible, recognized human faces. The audio and video data are merged into single XML files that are indexed and stored in a searchable database. A simple Web-based search engine can be used to retrieve information from the database, which currently contains more than 1800 hours of transcribed programmes from the Czech CT24 station.
Keywords :
XML; audio databases; audio signal processing; database indexing; digital television; face recognition; image representation; meta data; speaker recognition; text analysis; video databases; video retrieval; video signal processing; Czech TV news programme indexing; Web-based search engine; XML file; human face detection; human face recognition; information retrieval; joint audio-visual processing; meta data; pronunciation; searchable database; speaker identity; spoken content; text transcription; video scene representation; word position; Audio databases; Face detection; Face recognition; Humans; Indexing; Information retrieval; Layout; Search engines; TV; XML;
Conference_Titel :
Multimedia Signal Processing, 2008 IEEE 10th Workshop on
Conference_Location :
Cairns, Qld
Print_ISBN :
978-1-4244-2294-4
Electronic_ISBN :
978-1-4244-2295-1
DOI :
10.1109/MMSP.2008.4665213