Title :
Outliers influence to the point distance distribution normality within the data clusters
Author :
Malkic, J. ; Sarajlic, N. ; Hadzic, D.
Author_Institution :
Fac. of Electr. Eng., Univ. of Banjaluka, Banja Luka, Bosnia-Herzegovina
Abstract :
To verify cluster analysis results, a normality test is applied to the distribution of the data points' distances from their cluster center. Outlier points in the input data can, however, degrade this method. A normality test therefore recognizes and assesses clusters more reliably when the presence of outliers is reduced. This is confirmed empirically by comparing normality test results for clusters produced by different cluster analysis methods on the same data set.
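The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a specific normality test, so the Jarque-Bera statistic is used here as an assumption, and the function names (`distances_to_center`, `jarque_bera`), the synthetic 2-D cluster, and the three injected outliers are all hypothetical. The sketch only compares the statistic with and without outliers; it does not claim that either sample passes the test.

```python
import math
import random


def distances_to_center(points):
    """Euclidean distance of each point from the cluster centroid."""
    dim = len(points[0])
    center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    return [math.dist(p, center) for p in points]


def jarque_bera(values):
    """Return (statistic, p_value) of the Jarque-Bera normality test.

    Assumption: this test stands in for whichever normality test
    the paper actually uses.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    stat = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under normality the statistic is asymptotically chi-square with
    # 2 degrees of freedom, whose survival function is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)


# Hypothetical data: one synthetic 2-D cluster, then the same cluster
# with three far-away points added to play the role of outliers.
rng = random.Random(0)
cluster = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(300)]
with_outliers = cluster + [(15.0, 15.0), (-14.0, 16.0), (17.0, -13.0)]

stat_clean, p_clean = jarque_bera(distances_to_center(cluster))
stat_out, p_out = jarque_bera(distances_to_center(with_outliers))

print(f"without outliers: JB = {stat_clean:.1f}")
print(f"with outliers:    JB = {stat_out:.1f}")
```

A larger statistic means a stronger departure from normality, so the run with outliers is expected to score markedly worse than the run without them, which is the effect the abstract describes.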
Keywords :
data mining; pattern clustering; cluster analysis results; cluster assessment; cluster center; cluster recognition; data clusters; data point distance distribution; normality test; outlier points; Algorithm design and analysis; Clustering algorithms; Gaussian distribution; Histograms; Shape; Standards; Vectors; Cluster verification; distance distribution normality
Conference_Title :
Telecommunications Forum (TELFOR), 2012 20th
Conference_Location :
Belgrade
Print_ISBN :
978-1-4673-2983-5
DOI :
10.1109/TELFOR.2012.6419542