DocumentCode :
2030071
Title :
Improve K-means clustering for audio data by exploring a reasonable sampling rate
Author :
Chen, Gang ; Han, Bo
Author_Institution :
Int. Sch. of Software, Wuhan Univ., Wuhan, China
Volume :
4
fYear :
2010
fDate :
10-12 Aug. 2010
Firstpage :
1639
Lastpage :
1642
Abstract :
K-means clustering is sensitive to its starting points, and its time cost is high for large-scale data such as audio. Sampling is widely used to find “better” starting points and thereby speed up convergence. However, how to choose a reasonable sampling rate remains an open problem. In this paper, we report our initial exploration of locating reasonable sampling rates for different datasets. The procedure progressively increases the sampling rate and uses the cluster centers from the previous stage as the starting points for the next clustering. The resulting curve relating sampling rate to iteration number exhibits a turning point, which we take as the reasonable sampling rate. On two experimental audio datasets, the procedure clusters data more efficiently while maintaining similar clustering quality.
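The progressive procedure described in the abstract can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names `kmeans` and `progressive_kmeans`, the schedule of sampling rates, the convergence tolerance, and the use of uniform random sampling are all assumptions made for the sketch.

```python
import numpy as np


def kmeans(data, centers, max_iter=100, tol=1e-6):
    """Plain Lloyd's algorithm from given starting centers.

    Returns (final centers, number of iterations used).
    """
    centers = np.array(centers, dtype=float)
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assign every point to its nearest center (Euclidean distance).
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = data[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, iterations


def progressive_kmeans(data, k, rates=(0.01, 0.05, 0.1, 0.2, 1.0), seed=0):
    """Cluster growing samples, reusing the previous centers as the start.

    `rates` is a hypothetical schedule of sampling rates (assumption).
    Returns (final centers, list of (rate, iterations) pairs).
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Initial starting points: k distinct rows chosen at random.
    centers = data[rng.choice(len(data), size=k, replace=False)]
    history = []
    for rate in rates:
        n = min(len(data), max(k, int(round(rate * len(data)))))
        sample = data[rng.choice(len(data), size=n, replace=False)]
        # Centers from the previous stage seed the next clustering.
        centers, iters = kmeans(sample, centers)
        history.append((rate, iters))
    return centers, history
```

Plotting the `(rate, iterations)` pairs in `history` gives a curve of the kind the abstract refers to; how the turning point is identified on that curve is not specified in the abstract and is left out of this sketch.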
Keywords :
data mining; pattern clustering; K-means clustering; audio data; data clustering; reasonable sampling-rate; Algorithm design and analysis; Clustering algorithms; Data mining; Presses; Shape; Software; Software algorithms; K-means; audio; clustering; sampling-rate; starting points;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
Type :
conf
DOI :
10.1109/FSKD.2010.5569371
Filename :
5569371