Title :
A distributed approach for top-k star queries on massive information networks
Author :
Jiahui Jin ; Khemmarat, Samamon ; Lixin Gao ; Junzhou Luo
Author_Institution :
Sch. of Comput. Sci. & Eng., Southeast Univ., Nanjing, China
Abstract :
Massive information networks, such as the knowledge graph by Google, contain billions of labeled entities. Star queries, which aim to identify an entity, given a set of related entities, are common on such networks. Answering star queries can be modeled as a graph pattern matching problem. Traditional approaches apply graph indices to accelerate the query processing. Unfortunately, it is so costly that it is nearly infeasible to build indices on billion node graphs since the time or storage complexity of most indexing techniques is super-linear to the graph size. In this paper, we propose an algorithm to identify the top-k best answers for a star query. Instead of using expensive indices, our algorithm utilizes novel bounding techniques to derive the top-k best answers efficiently. Further, the algorithm can be implemented in a distributed manner scaling to billions of entities and hundreds of machines. We demonstrate the effectiveness and the efficiency of our approach through a series of experiments on real-world information networks.
Keywords :
distributed algorithms; information networks; query processing; Google; bounding technique; distributed approach; indexing techniques; massive information networks; query processing; star query answering; top-k star query; Acceleration; Clustering algorithms; Indexing; Motion pictures; Pattern matching; Query processing; Upper bound; Big Data; Billion-Node Graph; Distributed System; Information Network; Star Query; Top-k Query;
Conference_Titel :
Parallel and Distributed Systems (ICPADS), 2014 20th IEEE International Conference on
DOI :
10.1109/PADSW.2014.7097785