• DocumentCode
    3225349
  • Title
    Automatic Audio Classification and Speaker Identification for Video Content Analysis
  • Author
    Liu, Shu-Chang ; Bi, Jing ; Jia, Zhi-Qiang ; Chen, Rui ; Chen, Jie ; Zhou, Min-Min
  • Author_Institution
    Beijing Univ. of Posts & Telecommun., Beijing
  • Volume
    2
  • fYear
    2007
  • fDate
    July 30 2007-Aug. 1 2007
  • Firstpage
    91
  • Lastpage
    96
  • Abstract
    A growing body of literature proposes applying audio content analysis techniques to content-based video parsing. This paper presents our work on audio classification and speaker identification techniques for video content analysis. First, the soundtrack extracted from the video stream is partitioned into homogeneous segments using a rule-based and support vector machine (SVM) based classifier. Second, fixed-length speech clips randomly selected from the speech segments are grouped into clusters using spectral clustering. The clustered speech feature datasets are then used to initialize and train a Gaussian mixture model (GMM) for each speaker. Finally, the trained GMMs perform speaker identification. Experimental results confirm the validity of the proposed scheme.
  • Keywords
    Gaussian processes; speaker recognition; support vector machines; video signal processing; Gaussian mixture model; audio content analysis techniques; automatic audio classification; content-based video parsing; fixed-length speech clips; homogeneous segments; speaker identification techniques; spectral clustering techniques; speech feature datasets; support vector machine; video content analysis; video stream; Artificial intelligence; Cepstral analysis; Content based retrieval; Data mining; Information analysis; Loudspeakers; Speech; Streaming media; Support vector machine classification; Support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2007. SNPD 2007. Eighth ACIS International Conference on
  • Conference_Location
    Qingdao
  • Print_ISBN
    978-0-7695-2909-7
  • Type
    conf
  • DOI
    10.1109/SNPD.2007.516
  • Filename
    4287657