Title :
Assessment of clustering tendency through progressive random sampling and graph-based clustering results
Author :
Prasad, K.R. ; Reddy, B. Eswara
Author_Institution :
JNTUA Coll. of Eng., Anantapur, India
Abstract :
Clustering analysis is a widely used technique in many emerging applications. Clustering tendency is generally assessed with the Visual Assessment of Tendency (VAT) algorithm, which detects tendency by reordering the object indices of the dissimilarity matrix according to the logic of Prim's algorithm. Because it operates on the full dissimilarity matrix, VAT incurs a high computational cost on large datasets. The contribution of the proposed work is a sampling technique that yields a good representative of the entire dataset in the form of a sub-dissimilarity matrix for VAT. Prior clustering tendency can then be assessed visually by counting the square-shaped dark blocks along the diagonal of the sample-based VAT image. The proposed method gives the same clustering tendency results as plain VAT and requires less processing time, since it uses only the sampled dissimilarity matrix. This sample-based VAT (PSVAT) uses a set of distinguished features for the random selection of progressive sample representatives. Finally, the known clustering tendency is used in a graph-based clustering technique (minimum spanning tree based clustering) to achieve efficient clustering results. Comparative runtimes of PSVAT and VAT on several datasets are presented to show that PSVAT outperforms VAT in runtime, and clustering validity on the sampled data is tested with Dunn's index.
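The pipeline outlined in the abstract (sample, reorder with Prim's logic, then cluster on the minimum spanning tree) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `sample_dissimilarity`, `vat_order`, and `mst_clusters` are hypothetical, Euclidean distance is assumed, and plain uniform random sampling stands in for the progressive, feature-driven sampling of PSVAT, which the abstract does not specify.

```python
import numpy as np

def sample_dissimilarity(X, m, seed=0):
    """Assumption: uniform random sample of m rows, Euclidean dissimilarity."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    S = X[idx]
    diff = S[:, None, :] - S[None, :, :]
    return idx, np.sqrt((diff ** 2).sum(axis=-1))

def vat_order(D):
    """Reorder objects with Prim's logic; also return the MST edges found."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity.
    start = int(np.unravel_index(np.argmax(D), D.shape)[0])
    order = [start]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[start] = True
    best = D[start].copy()        # cheapest link from the tree to each object
    parent = np.full(n, start)    # tree vertex that provides that link
    edges = []
    for _ in range(n - 1):
        # Pick the object outside the tree that is closest to the tree.
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((int(parent[j]), j, float(best[j])))
        order.append(j)
        in_tree[j] = True
        closer = D[j] < best
        parent[closer] = j
        best = np.minimum(best, D[j])
    return np.array(order), edges

def mst_clusters(n, edges, k):
    """Cut the k-1 heaviest MST edges and label the resulting components."""
    keep = sorted(edges, key=lambda e: e[2])[: n - k]
    root = list(range(n))

    def find(a):
        while root[a] != a:
            root[a] = root[root[a]]   # path halving
            a = root[a]
        return a

    for u, v, _ in keep:
        root[find(u)] = find(v)
    _, labels = np.unique([find(i) for i in range(n)], return_inverse=True)
    return labels

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
    idx, D = sample_dissimilarity(X, m=30)
    order, edges = vat_order(D)
    vat_image = D[np.ix_(order, order)]   # dark diagonal blocks suggest clusters
    print(mst_clusters(len(idx), edges, k=2))
```

Displaying `vat_image` as a grayscale image would give the sample-based VAT picture; the number of dark diagonal blocks is then the value passed as `k` to the MST step.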
Keywords :
data mining; matrix algebra; pattern clustering; random processes; sampling methods; trees (mathematics); Dunn Index; PSVAT; VAT algorithm; clustering analysis; clustering tendency assessment; clustering tendency detection; clustering validity; dissimilarity matrix; graph-based clustering technique; minimum spanning tree-based clustering; progressive random sampling; random selection; runtime performance; sample-based VAT image; sampling technique; square shaped dark block detection; subdissimilarity matrix; visual access tendency algorithm; Algorithm design and analysis; Clustering algorithms; Data mining; Histograms; Indexes; Runtime; Visualization; Clustering; Dunn's Index; MST based clustering; Sampling; VAT;
Conference_Title :
Advance Computing Conference (IACC), 2013 IEEE 3rd International
Conference_Location :
Ghaziabad
Print_ISBN :
978-1-4673-4527-9
DOI :
10.1109/IAdCC.2013.6514316