DocumentCode
2710868
Title
Frequent Subgraph Retrieval in Geometric Graph Databases
Author
Nowozin, Sebastian ; Tsuda, Koji
Author_Institution
Max Planck Inst. for Biol. Cybern., Tubingen
fYear
2008
fDate
15-19 Dec. 2008
Firstpage
953
Lastpage
958
Abstract
Discovery of knowledge from geometric graph databases is of particular importance in chemistry and biology, because chemical compounds and proteins are represented as graphs with 3D geometric coordinates. In such applications, scientists are not interested in the statistics of the whole database. Instead they need information about a novel drug candidate or protein at hand, represented as a query graph. We propose a polynomial-delay algorithm for geometric frequent subgraph retrieval. It enumerates all subgraphs of a single given query graph which are frequent geometric epsi-subgraphs under the entire class of rigid geometric transformations in a database. By using geometric epsi-subgraphs, we achieve tolerance against variations in geometry. We compare the proposed algorithm to gSpan on chemical compound data, and we show that for a given minimum support the total number of frequent patterns is substantially limited by requiring geometric matching. Although the computation time per pattern is larger than for non-geometric graph mining, the total time is within a reasonable level even for small minimum support.
Keywords
data mining; database management systems; geometry; graph theory; query processing; 3D geometric coordinates; chemical compound data; geometric frequent subgraph retrieval; geometric graph databases; knowledge discovery; query graph; Chemical compounds; Chemistry; Drugs; Geometry; Information retrieval; Pattern matching; Polynomials; Proteins; Spatial databases; Statistics; frequent graph mining; geometric graph mining; geometric patterns;
fLanguage
English
Publisher
ieee
Conference_Titel
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location
Pisa
ISSN
1550-4786
Print_ISBN
978-0-7695-3502-9
Type
conf
DOI
10.1109/ICDM.2008.38
Filename
4781207
Link To Document