• DocumentCode
    417786
  • Title
    Audio segmentation based on multi-scale audio classification
  • Author
    Zhang, Yibin ; Zhou, Jie
  • Author_Institution
    Dept. of Autom., Tsinghua Univ., Beijing, China
  • Volume
    4
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    Content-based audio segmentation plays an important role in multimedia applications. To segment accurately and on-line, most conventional algorithms rely on small-scale feature classification, which typically results in a high false alarm rate. Our experimental results show that large-scale audio clips can be classified more easily than small-scale ones. Based on this observation, we present a novel multi-scale framework for audio segmentation. First, a rough segmentation step based on large-scale classification preserves the integrity of segment content, so that consecutive audio of the same kind is not split into different pieces. Then a fine segmentation step locates the segmentation points more precisely within the boundary areas found by the rough segmentation step. Experimental results show that a low false alarm rate can be achieved while preserving a low missing rate.
  • Keywords
    audio signal processing; multimedia communication; signal classification; content-based audio segmentation; false alarm rate; large-scale classification; missing rate; multi-scale audio classification; multimedia applications; Automation; Feature extraction; Frequency; Information analysis; Large-scale systems; Music; Speech; Streaming media; TV broadcasting; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326835
  • Filename
    1326835
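
The two-step scheme outlined in the abstract (rough segmentation on large windows, then finer localization inside the boundary areas) can be illustrated with a minimal sketch. The abstract does not specify the features, the classifier, or the window lengths, so everything below is an assumption made for illustration: `toy_classify`, `classify_windows`, `multiscale_segment`, and the `large`/`small` window sizes are hypothetical names and values, not the authors' method.

```python
from typing import Callable, List, Sequence

# Hypothetical classifier type: maps a window of samples to a class label.
Classifier = Callable[[Sequence[float]], str]


def toy_classify(window: Sequence[float]) -> str:
    """Stand-in classifier (assumption): thresholds the window mean."""
    return "music" if sum(window) / len(window) > 0.5 else "speech"


def classify_windows(signal: Sequence[float], win: int,
                     classify: Classifier) -> List[str]:
    """Label consecutive non-overlapping windows of length `win`."""
    return [classify(signal[i:i + win])
            for i in range(0, len(signal) - win + 1, win)]


def multiscale_segment(signal: Sequence[float], classify: Classifier,
                       large: int = 8, small: int = 2) -> List[int]:
    """Return estimated change points as sample indices."""
    # Rough step: classify large windows and look for label changes.
    labels = classify_windows(signal, large, classify)
    points = []
    for k in range(1, len(labels)):
        if labels[k] == labels[k - 1]:
            continue
        # Boundary area: the two large windows around the label change.
        lo, hi = (k - 1) * large, (k + 1) * large
        point = k * large  # fallback: the rough boundary itself
        # Fine step: scan small windows inside the boundary area and take
        # the first one that already carries the new label.
        for s in range(lo, hi - small + 1, small):
            if classify(signal[s:s + small]) == labels[k]:
                point = s
                break
        points.append(point)
    return points


if __name__ == "__main__":
    # Synthetic signal: class changes at sample 10.
    signal = [0.0] * 10 + [1.0] * 14
    print(multiscale_segment(signal, toy_classify))
```

In this toy signal the rough step alone would place the boundary at the large-window edge (sample 8), while the fine step moves it to the small window where the new class first appears. Small windows are only consulted inside boundary areas, which is one way to read the abstract's claim that same-class audio is not split by spurious small-scale decisions.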