Title :
Clustering of Complex Data-Sets Using Fractal Similarity Measures and Uncertainties
Author :
Maximilian Hoecker;Kai Lars Polsterer; Kügler;Vincent Heuveline
Author_Institution :
Heidelberg Univ., Heidelberg, Germany
Abstract :
The unsupervised analysis of data-sets that are large both in dimension and in number of objects is one of the most challenging tasks in data-intensive sciences. Especially in astronomy, dedicated survey telescopes generate an enormous amount of complex data. For example, the database of the Sloan Digital Sky Survey (SDSS DR10) contains 3 million spectra with ca. 5,000 values each. Analyzing those spectra is computationally demanding when applying standard techniques and standard similarity measures. In addition to the big-data aspects, one has to deal with the uncertainties of the measurements. We present a generic and noise-tolerant similarity measure which is based on box-counting methods and comparable to calculating fractal dimensions. Besides the theoretical aspects of the proposed method, the implementation details as well as the achieved evaluation results are discussed in this paper. Our implementation exploits current affordable computing architectures with large memory resources. The Fractal Similarity Measure enables scientists to perform clustering, classification and outlier detection in today's databases. Even though this is a generic method, the experiments shown in this paper demonstrate the performance just for clustering.
Keywords :
"Fractals","Extraterrestrial measurements","Clustering algorithms","Mathematical model","Shape","Uncertainty","Astronomy"
Conference_Titel :
Computational Science and Engineering (CSE), 2015 IEEE 18th International Conference on
DOI :
10.1109/CSE.2015.35