Title :
TreePi: A Novel Graph Indexing Method
Author :
Shijie Zhang ; Meng Hu ; Jiong Yang
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Case Western Reserve Univ., Cleveland, OH, USA
Abstract :
Graphs are widely used to model complex structured data such as XML documents, protein networks, and chemical compounds. One of the fundamental problems in graph databases is efficient search and retrieval of graphs using indexing techniques. In this paper, we study the problem of indexing graph databases using frequent subtrees as indexing structures. Trees can be manipulated efficiently while preserving a lot of structural information of the original graphs. In our proposed method, frequent subtrees of a database are selected as the feature set. To save memory, the set of feature trees is shrunk based on a support threshold function and their discriminative power. A tree-partition based query processing scheme is proposed to perform graph queries. The concept of center distance constraints is introduced to prune the search space. Furthermore, a new algorithm which utilizes the location information of indexing structures is used to perform subgraph isomorphism tests. We apply our method on a wide range of real and synthetic data to demonstrate the usefulness and effectiveness of this approach.
Keywords :
XML; database indexing; query processing; TreePi; XML documents; center distance constraints; chemical compounds; complex structured data; frequent subtrees; graph databases; graph indexing; graph queries; protein networks; subgraph isomorphism tests; tree-partition-based query processing; Chemical compounds; Indexing; Information retrieval; Performance evaluation; Proteins; Query processing; Spatial databases; Testing; Tree graphs; XML;
Conference_Titel :
Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Conference_Location :
Istanbul
Print_ISBN :
1-4244-0802-4
DOI :
10.1109/ICDE.2007.368955