• DocumentCode
    2704930
  • Title

    A fast audio classification from MPEG coded data

  • Author

    Nakajima, Yasuyuki ; Lu, Yang ; Sugano, Masaru ; Yoneyama, Akio ; Yamagihara, H. ; Kurematsu, Akira

  • Author_Institution
    KDD R, Saitama, Japan
  • Volume
    6
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    3005
  • Abstract
    Audio information classification is becoming a very important task for purposes such as automatic keyword spotting and other content-based audio-visual query systems. In this paper, we describe a fast and accurate audio data classification method that operates in the MPEG coded data domain. First, silent segments are detected using an approach that is robust to different recording conditions. Then the non-silent segments are classified into three types (music, speech, and applause) using the temporal density, bandwidth, and center frequency of subband energy. To be as robust as possible across a variety of audio sources, we use the Bayes discriminant function for a multivariate Gaussian distribution instead of manually adjusting a threshold for each discriminator. In the experiment, every one-second segment of MPEG audio data is classified, and about 90% of audio and speech segments are successfully detected. As for detection speed, less than 20% of the processing power of MPEG audio decoding is required.
  • Keywords
    Bayes methods; Gaussian distribution; audio coding; signal classification; statistical analysis; Bayes discriminant function; MPEG audio data; MPEG coded data; applause; audio information classification; automatic keyword spotting; bandwidth; center frequency; content-based audio-visual query systems; fast audio classification; multivariate Gaussian distribution; music; nonsilent segments; recording conditions; silent segments; speech; subband energy; temporal density; Cepstral analysis; Classification algorithms; Data analysis; Decoding; Gunshot detection systems; Indexing; Robustness; Signal analysis; Speech; TV;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.1999.757473
  • Filename
    757473
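
  • Example
    The abstract names a Bayes discriminant function for a multivariate Gaussian distribution as the classifier over three subband-energy features. The following is a minimal sketch of that general technique, not the authors' implementation. The function names, the feature scaling, and every mean, covariance, and prior in MODELS are made-up illustrative values; the record does not say how the paper estimates or scales them. For each class the sketch evaluates g(x) = -1/2 (x - mu)^T S^-1 (x - mu) - 1/2 ln|S| + ln P(class) and picks the largest score.

```python
import math


def cholesky(matrix):
    """Return lower-triangular L with L * L^T == matrix (matrix must be positive definite)."""
    d = len(matrix)
    lower = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def discriminant(x, mean, cov, prior):
    """Gaussian Bayes discriminant score; the class-independent constant term is dropped."""
    lower = cholesky(cov)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Forward substitution solves L z = diff, so z . z is the squared Mahalanobis distance.
    z = []
    for i in range(len(diff)):
        s = sum(lower[i][k] * z[k] for k in range(i))
        z.append((diff[i] - s) / lower[i][i])
    mahalanobis_sq = sum(v * v for v in z)
    # ln|cov| = 2 * sum(ln L_ii) because |cov| = |L|^2.
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(len(diff)))
    return -0.5 * mahalanobis_sq - 0.5 * log_det + math.log(prior)


def classify(x, models):
    """Return the label whose discriminant score is largest for feature vector x."""
    return max(models, key=lambda label: discriminant(x, *models[label]))


# Hypothetical parameters: (mean, covariance, prior) per class for the feature vector
# [temporal density, bandwidth, center frequency], each assumed scaled to roughly 0..1.
MODELS = {
    "music": ([0.2, 0.8, 0.6],
              [[0.02, 0.005, 0.0], [0.005, 0.03, 0.01], [0.0, 0.01, 0.04]],
              1.0 / 3.0),
    "speech": ([0.6, 0.4, 0.3],
               [[0.03, 0.0, 0.005], [0.0, 0.02, 0.0], [0.005, 0.0, 0.02]],
               1.0 / 3.0),
    "applause": ([0.9, 0.9, 0.8],
                 [[0.01, 0.0, 0.0], [0.0, 0.02, 0.005], [0.0, 0.005, 0.02]],
                 1.0 / 3.0),
}

if __name__ == "__main__":
    # One feature vector per one-second segment (values invented for illustration).
    print(classify([0.55, 0.45, 0.35], MODELS))
```

    In practice the means and covariances would be estimated from labeled training segments, which is what removes the need for a hand-tuned threshold per feature.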