Title :
Audio-cut detection and audio-segment classification using fuzzy c-means clustering
Author :
Nitanda, Naoki ; Haseyama, Miki ; Kitajima, Hideo
Author_Institution :
Sch. of Eng., Hokkaido Univ., Japan
Abstract :
This paper proposes a method for audio-cut detection and audio-segment classification based on fuzzy c-means clustering. The method detects audio-cuts, the boundaries between two different audio signals, by fuzzy c-means clustering, in which the fuzzy number represents the possibility that an audio-cut exists. Using this possibility, qualified audio-cut candidates can be obtained even when the audio signal contains effects such as fade-in and fade-out. The audio signal is then segmented at the detected audio-cuts, and the segments are classified into five classes: silence, music, speech, speech with music background, and speech with noise background. The classification step also removes wrongly detected audio-cuts, so that accurate audio-cuts and classification results are obtained.
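The abstract does not give the features or the detection rule, so the following is only a minimal sketch of standard fuzzy c-means applied to a toy problem, not the authors' implementation. It assumes a one-dimensional per-frame feature (for example, short-time energy), two clusters, and a fuzzifier of m = 2. The names `fuzzy_c_means` and `cut_candidates` and the feature values are hypothetical. The membership degrees play the role of a graded possibility, so frames inside a fade-in receive intermediate values rather than a hard label.

```python
def fuzzy_c_means(points, c=2, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy c-means on a list of 1-D feature values (c >= 2).

    Returns (centers, memberships); memberships[i][k] is the degree to
    which points[i] belongs to cluster k, and each row sums to 1.
    """
    lo, hi = min(points), max(points)
    # Deterministic initialization: spread the centers over the data range.
    centers = [lo + (hi - lo) * k / (c - 1) for k in range(c)]
    u = []
    for _ in range(iters):
        # Membership update from distances to the current centers.
        u = []
        for x in points:
            d = [abs(x - v) for v in centers]
            if 0.0 in d:
                # The point coincides with one or more centers.
                n_zero = d.count(0.0)
                row = [1.0 / n_zero if dk == 0.0 else 0.0 for dk in d]
            else:
                inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
                total = sum(inv)
                row = [w / total for w in inv]
            u.append(row)
        # Center update: mean of the points weighted by membership**m.
        new_centers = []
        for k in range(c):
            w = [row[k] ** m for row in u]
            new_centers.append(sum(wi * x for wi, x in zip(w, points)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def cut_candidates(memberships):
    """Frame indices where the dominant cluster differs from the previous frame.

    This thresholding rule is an assumption made for illustration only.
    """
    labels = [max(range(len(row)), key=row.__getitem__) for row in memberships]
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Hypothetical frame features: quiet frames, a fade-in, then loud frames.
features = [0.10, 0.12, 0.09, 0.11, 0.30, 0.55, 0.80, 0.98, 1.00, 1.02]
centers, u = fuzzy_c_means(features)
cuts = cut_candidates(u)
print("centers:", centers)
print("cut candidates at frames:", cuts)
```

In this sketch the frames in the middle of the fade-in keep substantial membership in both clusters, which is the kind of graded evidence that a later classification step could use to confirm or reject a candidate.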
Keywords :
audio signal processing; fuzzy set theory; signal classification; audio effects; audio signal boundaries; audio signal segmentation; audio-cut detection; audio-segment classification; fade-in; fade-out; fuzzy c-means clustering; music background; noise background; silence; speech; Background noise; Bandwidth; Digital audio players; Digital recording; IP networks; Material storage; Multiple signal classification; Signal processing; Speech enhancement; Transform coding
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1326829