DocumentCode :
3106328
Title :
Mining Generalized Graph Patterns Based on User Examples
Author :
Dmitriev, Pavel ; Lagoze, Carl
Author_Institution :
Dept. of Comput. Sci., Cornell Univ., Ithaca, NY
fYear :
2006
fDate :
18-22 Dec. 2006
Firstpage :
857
Lastpage :
862
Abstract :
There has been a lot of recent interest in mining patterns from graphs. Often, the exact structure of the patterns of interest is not known. This happens, for example, when molecular structures are mined to discover fragments useful as features in a chemical compound classification task, or when web sites are mined to discover sets of web pages representing logical documents. Such patterns are often generated from a few small subgraphs (cores), according to certain generalization rules (GRs). We call such patterns "generalized patterns" (GPs). While being structurally different, GPs often perform the same function in the network. Previously proposed approaches to mining GPs either assumed that the cores and the GRs are given, or that all interesting GPs are frequent. These are strong assumptions, which often do not hold in practical applications. In this paper, we propose an approach to mining GPs that is free from the above assumptions. Given a small number of GPs selected by the user, our algorithm discovers all GPs similar to the user examples. First, a machine learning-style approach is used to find the cores. Second, generalizations of the cores in the graph are computed to identify GPs. Evaluation on synthetic data, generated using real cores and GRs from biological and web domains, demonstrates the effectiveness of our approach.
Keywords :
data mining; graph theory; learning (artificial intelligence); Web pages; Web sites; chemical compound classification task; generalization rules; generalized patterns; graph patterns mining; logical documents; machine learning; patterns structure; Biology computing; Chemical compounds; Chemical technology; Computer science; Data mining; Evolution (biology); HTML; Machine learning algorithms; Proteins; Web pages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
ISSN :
1550-4786
Print_ISBN :
0-7695-2701-7
Type :
conf
DOI :
10.1109/ICDM.2006.108
Filename :
4053116