DocumentCode :
710105
Title :
Efficient and scalable trie-based algorithms for computing set containment relations
Author :
Yongming Luo ; Fletcher, George H. L. ; Hidders, Jan ; De Bra, Paul
Author_Institution :
Eindhoven Univ. of Technol., Eindhoven, Netherlands
fYear :
2015
fDate :
13-17 April 2015
Firstpage :
303
Lastpage :
314
Abstract :
Computing containment relations between massive collections of sets is a fundamental operation in data management, for example in graph analytics and data mining applications. Motivated by recent hardware trends, in this paper we present two novel solutions for computing set-containment joins over massive sets: the Patricia Trie-based Signature Join (PTSJ) and PRETTI+, a Patricia trie enhanced extension of the state-of-the-art PRETTI join. The compact trie structure not only enables efficient use of main-memory, but also significantly boosts the performance of both approaches. By carefully analyzing the algorithms and conducting extensive experiments with various synthetic and real-world datasets, we show that, in many practical cases, our algorithms are an order of magnitude faster than the state-of-the-art.
Keywords :
data analysis; data structures; information retrieval; storage management; PRETTI join; PRETTI+; PTSJ; Patricia trie-based signature join; computing set containment relation; data management; data mining; graph analytics; scalable trie-based algorithm; Algorithm design and analysis; Data mining; Data structures; Estimation; Hardware; Indexes; Market research;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering (ICDE), 2015 IEEE 31st International Conference on
Conference_Location :
Seoul
Type :
conf
DOI :
10.1109/ICDE.2015.7113293
Filename :
7113293
Link To Document :
بازگشت