Regional Information Center for Science and Technology (RICEST) - Speech retrieval for TV news programs by fusing the audio and video information

DocumentCode :

3179608

Title :

Speech retrieval for TV news programs by fusing the audio and video information

Author :

Gao, Xinbo ; Li, Jie ; Ji, Hongbing

Author_Institution :

Sch. of Electron. Eng., Xidian Univ., Xi'an, China

Volume :

fYear :

2002

fDate :

26-30 Aug. 2002

Firstpage :

994

Abstract :

A typical news story contains a brief report by the anchor person(s) in the studio, as well as news footage from the field. Investigation shows that our recognizer performs better when indexing audio from the studio than audio from the field. In order to automatically extract the "reliable" audio segments for speech retrieval, we attempt to detect studio-to-field transitions by means of video parsing. Our research is based on 146 news stories collected from the Hong Kong TVB Jade station. Retrieval using the entire audio track gave an average inverse rank (AIR) of 0.759, while retrieval based only on the studio recordings, selected with the help of video parsing, produced AIR = 0.765.
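
The abstract reports results as average inverse rank (AIR) without defining it. A minimal sketch follows, assuming AIR is the mean over queries of 1/rank of the single correct story in the retrieved list (the usual reading of the term, equivalent to mean reciprocal rank). The function name, the data layout, and the choice to score a missing story as 0 are illustrative assumptions, not details taken from the paper.

```python
from typing import Hashable, Mapping, Sequence


def average_inverse_rank(
    rankings: Mapping[Hashable, Sequence[Hashable]],
    relevant: Mapping[Hashable, Hashable],
) -> float:
    """Mean of 1/rank of the correct item over all queries.

    rankings: query id -> retrieved story ids, best match first (hypothetical layout).
    relevant: query id -> the one correct story id for that query.
    A query whose correct story is not retrieved contributes 0 (an assumption).
    """
    if not rankings:
        return 0.0
    total = 0.0
    for query, ranked in rankings.items():
        target = relevant[query]
        if target in ranked:
            # Ranks are 1-based: the top result has rank 1.
            total += 1.0 / (list(ranked).index(target) + 1)
    return total / len(rankings)


# Toy data, invented for illustration only.
example_rankings = {"q1": ["s1", "s2"], "q2": ["s3", "s4", "s5"], "q3": ["s6"]}
example_relevant = {"q1": "s1", "q2": "s5", "q3": "s9"}
example_air = average_inverse_rank(example_rankings, example_relevant)
```

Under this definition an AIR of 1.0 means the correct story is always ranked first, so the reported change from 0.759 to 0.765 corresponds to the correct story moving slightly closer to the top of the list on average.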

Keywords :

audio signal processing; database indexing; feature extraction; information retrieval; multimedia databases; speech processing; speech recognition; television production; Hong Kong TVB Jade station; TV news programs; audio indexing; audio video information fusion; automatic extraction; reliable audio segments; speech retrieval; studio-to-field transition detection; video parsing; Audio recording; Automatic speech recognition; Hidden Markov models; Indexing; Information retrieval; Speech recognition; TV; Testing; Video on demand; Video recording;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal Processing, 2002 6th International Conference on

Print_ISBN :

0-7803-7488-6

Type :

conf

DOI :

10.1109/ICOSP.2002.1179955

Filename :

1179955

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3179608