DocumentCode :
3569624
Title :
DPM: Fast and scalable clustering algorithm for large scale high dimensional datasets
Author :
Ghanem, Tamer F. ; Elkilani, Wail S. ; Ahmed, Hatem S. ; Hadhoud, Mohiy M.
Author_Institution :
Inf. Technol. Dept., Menofiya Univ., Shebin-El-Kom, Egypt
fYear :
2014
Firstpage :
26
Lastpage :
35
Abstract :
Clustering multi-density, large-scale, high-dimensional datasets is a challenging task due to the scalability limits of most clustering algorithms. Modern data collection tools produce large amounts of data, so fast and scalable algorithms are a vital requirement for clustering such data. In this paper, a fast and scalable algorithm called dimension-based partitioning and merging clustering (DPM) is proposed. In DPM, data is partitioned into small dense volumes while the value range of each dimension is processed. Next, noise is filtered out using the dimensional densities of the generated partitions. Finally, a merging process is invoked to construct clusters based on the boundary data samples of the partitions. The DPM algorithm automatically detects the number of data clusters based on three insensitive tuning parameters, which reduces the burden of using it. Performance evaluation on different datasets demonstrates the speed and scalability of the proposed algorithm, along with its clustering accuracy, compared to other large-scale clustering algorithms.
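The partition, filter, and merge stages named in the abstract can be illustrated with a rough sketch. This is not the DPM algorithm itself: the abstract gives no details of its partitioning rule, density measure, or boundary-sample merging, so the code below substitutes a plain fixed-width grid, a minimum cell count, and adjacency-based merging. The function name `grid_cluster` and the parameters `cell_width` and `min_points` are hypothetical and do not correspond to the three tuning parameters mentioned above.

```python
from collections import defaultdict
from itertools import combinations


def grid_cluster(points, cell_width=1.0, min_points=2):
    """Toy partition / filter / merge pipeline (a stand-in, not the paper's DPM)."""
    # Step 1 (partition): bin the value range of every dimension into
    # fixed-width intervals; a point's cell is the tuple of its bin indices.
    cells = defaultdict(list)
    for idx, point in enumerate(points):
        key = tuple(int(v // cell_width) for v in point)
        cells[key].append(idx)

    # Step 2 (noise filtering): keep only cells holding enough points.
    # Assumption: a simple count threshold replaces "dimensional densities".
    dense = {k: members for k, members in cells.items() if len(members) >= min_points}

    # Step 3 (merging): join dense cells that touch, using union-find.
    # Assumption: cell adjacency replaces merging on boundary data samples.
    parent = {k: k for k in dense}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for a, b in combinations(dense, 2):
        if all(abs(x - y) <= 1 for x, y in zip(a, b)):
            parent[find(a)] = find(b)

    # Number clusters from 0 in order of first appearance; noise stays -1.
    labels = [-1] * len(points)
    cluster_ids = {}
    for key, members in dense.items():
        cid = cluster_ids.setdefault(find(key), len(cluster_ids))
        for idx in members:
            labels[idx] = cid
    return labels
```

As in the abstract, the number of clusters is not supplied up front; it falls out of the merging step. The pairwise comparison of dense cells is quadratic in the number of cells, so this sketch says nothing about the speed or scalability reported for DPM.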
Keywords :
data acquisition; filtering theory; merging; pattern clustering; DPM algorithm; clustering algorithms; data collection tools; data partitioning; dimension-based partitioning and merging clustering; dimensional densities; merging process; multidense large scale high dimensional datasets clustering; noise filtering; partitions boundary data samples; scalability limits; tuning parameters; Clustering algorithms; Filtering algorithms; Wavelet transforms; Clustering; density-based clustering; subspace clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Engineering Conference (ICENCO), 2014 10th International
Print_ISBN :
978-1-4799-5240-3
Type :
conf
DOI :
10.1109/ICENCO.2014.7050427
Filename :
7050427