Title :
Star-Structured High-Order Heterogeneous Data Co-clustering Based on Consistent Information Theory
Author :
Gao, Bin ; Liu, Tie-Yan ; Ma, Wei-Ying
Author_Institution :
Microsoft Res. Asia, Beijing
Abstract :
Heterogeneous object co-clustering has become an important research topic in data mining. In the early years of this research, people mainly worked on two types of heterogeneous data (pair-wise co-clustering), while more recently increasing attention has been paid to multiple types of heterogeneous data (high-order co-clustering). In this paper, we study the high-order co-clustering of objects with a star-structured interrelationship, i.e., there is a central type of object that connects the other types of objects. This case is a good model for many real-world applications, such as the co-clustering of Web images, their low-level visual features, and the surrounding text. We use a tripartite graph to represent the interrelationships among the different objects, and propose a consistent information theory that yields an effective algorithm for obtaining the co-clusters of the different types of objects. Experiments on a Web image show that our proposed algorithm is a better choice compared with previous work on heterogeneous object co-clustering.
Keywords :
data mining; graph theory; pattern clustering; consistent information theory; star-structured high-order heterogeneous data co-clustering; tripartite graph; Asia; Clustering algorithms; Constraint optimization; Constraint theory; Information theory; Mutual information; Probability distribution; Random variables; Search engines
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2701-7
DOI :
10.1109/ICDM.2006.154