DocumentCode :
2645273
Title :
A Speaker Identification System for Video Content Analysis
Author :
Bi, Jing ; Liu, Shu-Chang
Author_Institution :
Beijing Univ. of Posts & Telecommun., Beijing
fYear :
2008
fDate :
15-17 Aug. 2008
Firstpage :
200
Lastpage :
203
Abstract :
Recently, a growing body of literature has proposed applying audio content analysis techniques to content-based video parsing. This paper presents our current work on a speaker identification system for video content analysis. The system differs from conventional ones in the following respects: first, the soundtrack extracted from a video stream includes not only silence and speech but also music and environmental sound; second, the number of speakers in the video content is uncertain; third, the presence of noise in the video can significantly degrade system performance. In light of these considerations, our speaker identification system comprises the following basic parts: audio classification and segmentation using rule-based and support vector machine (SVM) based classifiers; speech clustering using a spectral clustering technique and speaker identification based on a Gaussian mixture model (GMM); and speech enhancement based on spectral subtraction. Experiments are carried out on a database extracted from news, conversation, and movie videos. The results obtained confirm the validity of the proposed system architecture.
Keywords :
Gaussian processes; content-based retrieval; pattern classification; pattern clustering; speaker recognition; speech enhancement; support vector machines; video signal processing; video streaming; Gaussian mixture model; audio classification; audio segmentation; content-based video parsing; rule-based classifier; speaker identification system; spectral clustering technique; speech clustering; support vector machine; video content analysis; video stream; Acoustic noise; Databases; Loudspeakers; Music; Speech enhancement; Streaming media; Support vector machine classification; Support vector machines; System performance; Working environment noise; Audio Classification and Segmentation; Speaker Identification; Speech Enhancement; Video Parsing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Information Hiding and Multimedia Signal Processing, 2008. IIHMSP '08 International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-0-7695-3278-3
Type :
conf
DOI :
10.1109/IIH-MSP.2008.215
Filename :
4604039