• DocumentCode
    417781
  • Title

    Audio-cut detection and audio-segment classification using fuzzy c-means clustering

  • Author

    Nitanda, Naoki ; Haseyama, Miki ; Kitajima, Hideo

  • Author_Institution
    Sch. of Eng., Hokkaido Univ., Japan
  • Volume
    4
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    This paper proposes a method for audio-cut detection and audio-segment classification using fuzzy c-means clustering. The method detects audio-cuts, the boundaries between two different audio signals, by fuzzy c-means clustering, in which the fuzzy number represents the possibility that an audio-cut exists. Based on this possibility, qualified candidates for audio-cuts can be obtained even when the audio signal includes audio effects such as fade-in and fade-out. The audio signal is then segmented at the detected audio-cuts, and each segment is classified into one of five classes: silence, music, speech, speech with music background, and speech with noise background. The classification step simultaneously removes wrongly detected audio-cuts, which yields accurate audio-cuts and classification results.
  • Keywords
    audio signal processing; fuzzy set theory; signal classification; audio effects; audio signal boundaries; audio signal segmentation; audio-cut detection; audio-segment classification; fade-in; fade-out; fuzzy c-means clustering; music background; noise background; silence; speech; Background noise; Bandwidth; Digital audio players; Digital recording; IP networks; Material storage; Multiple signal classification; Signal processing; Speech enhancement; Transform coding
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2004.1326829
  • Filename
    1326829
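
  • Illustration
    The abstract describes fuzzy c-means memberships as a graded indication of where an audio-cut may lie, which is why a gradual transition such as a fade-out can still yield a candidate. The sketch below is not the authors' implementation. It applies the textbook fuzzy c-means updates to a made-up sequence of per-frame energy values and marks a candidate cut wherever the dominant cluster changes. The feature (scalar energy), the sample values, the two-cluster setup, and the names fuzzy_c_means_1d and candidate_cuts are all assumptions made for illustration.

```python
def _memberships(values, centers, m, eps):
    """Fuzzy membership of every value in every cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        # eps avoids division by zero when a value coincides with a center.
        dists = [abs(x - v) + eps for v in centers]
        rows.append([1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists])
    return rows


def fuzzy_c_means_1d(values, c=2, m=2.0, iters=50, eps=1e-9):
    """Textbook fuzzy c-means on scalar features (requires c >= 2, m > 1)."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the centers evenly over the data range.
    centers = [lo + (hi - lo) * k / (c - 1) for k in range(c)]
    for _ in range(iters):
        u = _memberships(values, centers, m, eps)
        # Each center is the mean of the values weighted by membership**m.
        centers = [
            sum((row[k] ** m) * x for row, x in zip(u, values))
            / sum(row[k] ** m for row in u)
            for k in range(c)
        ]
    return centers, _memberships(values, centers, m, eps)


def candidate_cuts(memberships):
    """Frame indices at which the highest-membership cluster changes."""
    labels = [row.index(max(row)) for row in memberships]
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Hypothetical per-frame energies: a loud passage fading out into a quiet one.
energies = [0.9, 1.0, 1.1, 0.8, 0.5, 0.2, 0.1, 0.15, 0.1]
centers, memberships = fuzzy_c_means_1d(energies)
cuts = candidate_cuts(memberships)

for i, row in enumerate(memberships):
    print(i, [round(u, 2) for u in row])
print("candidate cut frames:", cuts)
```

    Frames in the middle of the fade receive intermediate memberships rather than a hard label, so a threshold on the membership values, instead of the simple label change used here, would be one way to rank candidates by how likely they are to be real cuts.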