Title : 
A mathematical model of similarity and clustering
         
        
            Author : 
Sun, Fu-Shing ; Tzeng, Chun-Hung
         
        
            Author_Institution : 
Dept. of Comput. Sci., Ball State Univ., Muncie, IN, USA
         
        
        
        
        
        
            Abstract : 
This paper introduces an abstract model of data similarity and clustering. A similarity on a space Ω is formulated explicitly as a reflexive and symmetric binary relation, called a tolerance relation, for which we introduce three types of coverings of Ω. Given a covering U, a clustering is defined to be a minimal sub-covering of U. Searching for an optimal clustering means minimizing the number of clusters, which is intractable in general. This paper proposes a heuristic method to search for sub-optimal clusterings for a given tolerance relation.
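
The ideas in the abstract can be illustrated with a small sketch. The abstract does not specify the three covering types or the paper's heuristic, so everything below is an assumption for illustration: the covering is built from the neighborhoods of each point, the search is a standard greedy set-cover pass followed by pruning, and the names `is_tolerance`, `neighborhood_covering`, `is_covering`, and `minimal_subcovering` are hypothetical.

```python
from itertools import combinations


def is_tolerance(points, related):
    """Check that `related` is reflexive and symmetric on `points`."""
    reflexive = all(related(x, x) for x in points)
    symmetric = all(related(x, y) == related(y, x)
                    for x, y in combinations(points, 2))
    return reflexive and symmetric


def neighborhood_covering(points, related):
    """Assumed covering: for each x, the set of all points similar to x."""
    return [frozenset(y for y in points if related(x, y)) for x in points]


def is_covering(points, sets):
    """True if the union of `sets` contains every point."""
    return set().union(*sets) >= set(points)


def minimal_subcovering(points, covering):
    """Greedy set-cover pass, then drop any set that is not needed.

    This is a generic heuristic, not the method proposed in the paper.
    The result is minimal (no member can be removed) but not
    necessarily of minimum size.
    """
    uncovered = set(points)
    chosen = []
    while uncovered:
        best = max(covering, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            raise ValueError("the given sets do not cover all points")
        chosen.append(best)
        uncovered -= best
    i = 0
    while i < len(chosen):
        rest = chosen[:i] + chosen[i + 1:]
        if is_covering(points, rest):
            chosen = rest  # set i was redundant; recheck the same index
        else:
            i += 1
    return chosen


# Toy similarity: two integers are similar if they differ by at most 1.
# It is reflexive and symmetric but not transitive (0~1 and 1~2, yet not 0~2).
points = [0, 1, 2, 3, 10, 11]
related = lambda x, y: abs(x - y) <= 1

covering = neighborhood_covering(points, related)
clusters = minimal_subcovering(points, covering)
```

Because a tolerance relation need not be transitive, the resulting clusters may overlap, which distinguishes this setting from a partition into equivalence classes.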
         
        
            Keywords : 
data mining; data structures; pattern clustering; search problems; data clustering; data similarity; heuristic method; minimal subcovering; optimal clustering; reflexive binary relation; symmetric binary relation; tolerance relation; Clustering algorithms; Computer science; Data mining; Electronic mail; Euclidean distance; Fractals; Lattices; Mathematical model; Sun; Topology;
         
        
        
        
            Conference_Title : 
Information Technology: Coding and Computing, 2004. Proceedings. ITCC 2004. International Conference on
         
        
            Print_ISBN : 
0-7695-2108-8
         
        
        
            DOI : 
10.1109/ITCC.2004.1286499