DocumentCode
3225349
Title
Automatic Audio Classification and Speaker Identification for Video Content Analysis
Author
Liu, Shu-Chang ; Bi, Jing ; Jia, Zhi-Qiang ; Chen, Rui ; Chen, Jie ; Zhou, Min-Min
Author_Institution
Beijing Univ. of Posts & Telecommun., Beijing
Volume
2
fYear
2007
fDate
July 30 2007-Aug. 1 2007
Firstpage
91
Lastpage
96
Abstract
Recent literature increasingly proposes applying audio content analysis techniques to content-based video parsing. This paper presents our work on audio classification and speaker identification techniques for video content analysis. First, the soundtrack extracted from the video stream is partitioned into homogeneous segments using a rule- and support vector machine (SVM)-based classifier. Second, fixed-length speech clips randomly selected from the speech segments are grouped into clusters using spectral clustering. The clustered speech feature datasets are then used to initialize and train a Gaussian mixture model (GMM) for each speaker. Finally, the trained GMMs perform speaker identification. Experimental results confirm the validity of the proposed scheme.
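The last two stages described in the abstract (one GMM per speaker, then identification by the best-scoring model) can be illustrated with a minimal sketch. This is not the authors' implementation: the diagonal-covariance EM routine, the function names (`train_gmm`, `identify`, `make_frames`), the speaker labels, and the synthetic 2-D "features" are all assumptions made for illustration. The SVM segmentation and spectral clustering stages are omitted, and the per-speaker training frames are simply assumed to have been grouped already.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def frame_log_likelihood(x, gmm):
    """log p(x | gmm), where gmm is a list of (weight, mean, var) tuples."""
    return log_sum_exp(
        [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    )


def train_gmm(frames, n_components=2, n_iter=20, var_floor=1e-3, seed=0):
    """Fit a diagonal GMM with plain EM (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    # Initialise means from randomly chosen frames, unit variances, equal weights.
    gmm = [(1.0 / n_components, list(m), [1.0] * dim)
           for m in rng.sample(frames, n_components)]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
            norm = log_sum_exp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and floored variances.
        new_gmm = []
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            mean = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                    for d in range(dim)]
            var = [max(sum(r[k] * (x[d] - mean[d]) ** 2
                           for r, x in zip(resp, frames)) / nk, var_floor)
                   for d in range(dim)]
            new_gmm.append((max(nk / len(frames), 1e-10), mean, var))
        gmm = new_gmm
    return gmm


def identify(frames, models):
    """Return the speaker whose GMM gives the highest mean log-likelihood."""
    scores = {
        name: sum(frame_log_likelihood(x, g) for x in frames) / len(frames)
        for name, g in models.items()
    }
    return max(scores, key=scores.get), scores


def make_frames(center, n, rng, spread=0.5):
    """Synthetic 2-D stand-in for real speech features (assumption)."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]


rng = random.Random(1)
# Hypothetical, already-clustered training frames for two speakers.
models = {
    "spk_a": train_gmm(make_frames((0.0, 0.0), 200, rng)),
    "spk_b": train_gmm(make_frames((5.0, 5.0), 200, rng)),
}
label, _ = identify(make_frames((5.0, 5.0), 50, rng), models)
print(label)
```

In practice the feature vectors would be cepstral features extracted from the speech clips rather than synthetic points, and the number of mixture components would be tuned to the data.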
Keywords
Gaussian processes; speaker recognition; support vector machines; video signal processing; Gaussian mixture model; audio content analysis techniques; automatic audio classification; content-based video parsing; fixed-length speech clips; homogeneous segments; speaker identification techniques; spectral clustering techniques; speech feature datasets; support vector machine; video content analysis; video stream; Artificial intelligence; Cepstral analysis; Content based retrieval; Data mining; Information analysis; Loudspeakers; Speech; Streaming media; Support vector machine classification; Support vector machines;
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2007. SNPD 2007. Eighth ACIS International Conference on
Conference_Location
Qingdao
Print_ISBN
978-0-7695-2909-7
Type
conf
DOI
10.1109/SNPD.2007.516
Filename
4287657