Title :
GString: A Novel Approach for Efficient Search in Graph Databases
Author :
Haoliang Jiang ; Haixun Wang ; Philip S. Yu ; Shuigeng Zhou
Author_Institution :
Dept. of Comput. Sci. & Eng., Fudan Univ., Shanghai, China
Abstract :
Graphs are widely used for modeling complicated data, including chemical compounds, protein interactions, XML documents, and multimedia. Information retrieval against such data can be formulated as a graph search problem, and finding an efficient solution to the problem is essential for many applications. A popular approach is to represent both graphs and queries on graphs by sequences, thus converting graph search to subsequence matching. State-of-the-art sequencing methods work at the finest granularity: each node (or edge) in the graph appears as an element in the resulting sequence. Such methods are not semantics-conscious, and the resulting sequences are not only bulky but also prone to complexities arising from graph isomorphism and other problems in searching. In this paper, we introduce a novel sequencing method that captures the semantics of the underlying graph data. We find meaningful components in graph structures and use them as the most basic units in sequencing. This not only reduces the size of the resulting sequences, but also enables semantics-based searching. We base our approach on chemical compound databases, although it can be applied to searching other complicated graphs, such as protein structures. Experiments demonstrate that our approach outperforms state-of-the-art graph search methods.
Keywords :
graph theory; query processing; search problems; GString; graph database search; graph search problem; graph structures; information retrieval; subsequence matching; Chemical compounds; Explosions; Filtering; Information retrieval; Multimedia databases; Proteins; Search methods; Search problems; Spatial databases; XML
Conference_Title :
Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Conference_Location :
Istanbul
Print_ISBN :
1-4244-0802-4
DOI :
10.1109/ICDE.2007.367902