Title :
Clustering-Variable-Width Histogram Based Window Semi-hash Multi-join over Streams
Author :
Zhang, Xiaojian ; Jiang, Wanchang ; Zhang, Yadong ; Huo, Cong
Author_Institution :
Henan Univ. of Finance & Econ., Zhengzhou
Abstract :
The join operator has become increasingly important in the context of data streams. Most join algorithms over streams to date are based on nested-loop joins or hash joins. However, these time-expensive algorithms are not suitable for CPU-limited settings. In this paper, a clustering-based variable-width histogram is designed to capture the value distribution of tuples in the sliding windows while retaining some important characteristics of the tuples. Semi-hash tables are then constructed from the histogram. Our sliding-window semi-hash multi-join algorithm can minimize the processing time cost of the join and produce an accurate join result much earlier. Experimental results show that our approach is more efficient than other approaches.
Keywords :
data analysis; file organisation; query processing; clustering-variable-width histogram; data stream; nested loop joins; processing time cost; time expensive algorithms; window semi-hash multi-join; Clustering algorithms; Computer science; Costs; Data analysis; Data engineering; Educational institutions; Finance; Histograms; Information science; Information technology
Conference_Title :
Convergence Information Technology, 2007. International Conference on
Conference_Location :
Gyeongju
Print_ISBN :
0-7695-3038-9
DOI :
10.1109/ICCIT.2007.246