DocumentCode :
3179608
Title :
Speech retrieval for TV news programs by fusing the audio and video information
Author :
Gao, Xinbo ; Li, Jie ; Ji, Hongbing
Author_Institution :
Sch. of Electron. Eng., Xidian Univ., Xi'an, China
Volume :
2
fYear :
2002
fDate :
26-30 Aug. 2002
Firstpage :
994
Abstract :
A typical news story contains a brief report by the anchor person(s) in the studio, as well as news footage from the field. Our investigation shows that our recognizer performs better when indexing audio from the studio than audio from the field. To automatically extract the "reliable" audio segments for speech retrieval, we attempt to detect studio-to-field transitions by means of video parsing. Our research is based on 146 news stories collected from the Hong Kong TVB Jade station. Retrieval using the entire audio track gave an average inverse rank (AIR) of 0.759, while retrieval based only on the studio recordings, identified by video parsing, gave AIR = 0.765.
Keywords :
audio signal processing; database indexing; feature extraction; information retrieval; multimedia databases; speech processing; speech recognition; television production; Hong Kong TVB Jade station; TV news programs; audio indexing; audio video information fusion; automatic extraction; reliable audio segments; speech retrieval; studio-to-field transition detection; video parsing; Audio recording; Automatic speech recognition; Hidden Markov models; Indexing; Information retrieval; Speech recognition; TV; Testing; Video on demand; Video recording;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing, 2002 6th International Conference on
Print_ISBN :
0-7803-7488-6
Type :
conf
DOI :
10.1109/ICOSP.2002.1179955
Filename :
1179955