Title :
Means for finding meaningful levels of a hierarchical sequence prior to performing a cluster analysis
Author :
Olsen, David Allen
Author_Institution :
Department of Computer Science and Engineering, University of Minnesota-Twin Cities, Minneapolis, U.S.A.
Abstract :
When the assumptions underlying the standard complete linkage method are unwound, the size of a hierarchical sequence reverts from n levels to n(n−1)/2 + 1 levels, and the time complexity of constructing a hierarchical sequence of cluster sets becomes O(n^4). Moreover, the post hoc heuristics for cutting dendrograms are not suitable for finding meaningful cluster sets of an (n(n−1)/2 + 1)-level hierarchical sequence. To overcome these problems for small-n, large-m data sets, the project described in this paper went back more than 60 years to solve a problem that could not be solved then. This paper presents a means for finding meaningful levels of an (n(n−1)/2 + 1)-level hierarchical sequence prior to performing a cluster analysis. By finding meaningful levels of such a hierarchical sequence beforehand, it is possible to know which cluster sets to construct and to construct only those cluster sets. This paper also shows how increasing the dimensionality of the data points helps reveal inherent structure in noisy data. The means is theoretically validated. Empirical results from four experiments show that finding meaningful levels of a hierarchical sequence is easy and that meaningful cluster sets can have real-world meaning.
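The level count quoted in the abstract can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes that each level beyond the initial one corresponds to one unordered pair of data points, which gives n(n−1)/2 pairs plus one initial level. The function names level_count and pair_count are hypothetical.

```python
from itertools import combinations


def level_count(n: int) -> int:
    """Closed form from the abstract: n(n-1)/2 + 1 levels for n data points."""
    return n * (n - 1) // 2 + 1


def pair_count(n: int) -> int:
    """Count unordered pairs of n points by enumeration (assumed: one level per pair)."""
    return sum(1 for _ in combinations(range(n), 2))


if __name__ == "__main__":
    for n in (5, 10, 50):
        # The closed form and the enumerated pair count plus the initial level should agree.
        print(n, level_count(n), pair_count(n) + 1)
```

Under this assumption, even a small data set of 50 points yields more than a thousand levels, which is why knowing in advance which levels are meaningful matters.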
Keywords :
Clustering methods; Couplings; Noise; Noise measurement; Sensors; Standards; Time complexity; Complete Linkage; Distance Graphs; Hierarchical Clustering; Hierarchical Sequence; Intelligent Control Systems; Meaningful Cluster Set; Meaningful Level; Noise Attenuation
Conference_Title :
Informatics in Control, Automation and Robotics (ICINCO), 2014 11th International Conference on