• DocumentCode
    699233
  • Title

    Audio spectrum projection based on several basis decomposition algorithms applied to general sound recognition and audio segmentation

  • Author

    Kim, Hyoung-Gook; Sikora, Thomas

  • Author_Institution
    Commun. Syst. Group, Tech. Univ. of Berlin, Berlin, Germany
  • fYear
    2004
  • fDate
    6-10 Sept. 2004
  • Firstpage
    1047
  • Lastpage
    1050
  • Abstract
    Our challenge is to analyze and classify video soundtrack content for indexing purposes. To this end, we compare the performance of MPEG-7 Audio Spectrum Projection (ASP) features based on basis decomposition against Mel-scale Frequency Cepstrum Coefficients (MFCC). For the basis decomposition step of feature extraction we consider three choices: Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Audio features are computed from the resulting reduced vectors and fed into a hidden Markov model (HMM) classifier. Experimental results show that the MFCC features yield better performance than MPEG-7 ASP in both sound recognition and audio segmentation.
  • Keywords
    feature extraction; hidden Markov models; independent component analysis; matrix decomposition; principal component analysis; video signal processing; HMM classifier; ICA; MPEG-7 audio spectrum projection features; Mel-scale frequency cepstrum coefficients; NMF; PCA; audio segmentation; basis decomposition algorithms; general sound recognition; hidden Markov model classifier; indexing purposes; nonnegative matrix factorization; Abstracts; Classification algorithms; Mel frequency cepstral coefficient; Speech; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference, 2004 12th European
  • Conference_Location
    Vienna
  • Print_ISBN
    978-320-0001-65-7
  • Type

    conf

  • Filename
    7079763