Title :
SJMR: Parallelizing spatial join with MapReduce on clusters
Author :
Zhang, Shubin ; Han, Jizhong ; Liu, Zhiyong ; Wang, Kai ; Xu, Zhiyong
Author_Institution :
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing, China
fDate :
Aug. 31 2009-Sept. 4 2009
Abstract :
MapReduce is a widely used parallel programming model and computing platform. With MapReduce, it is easy to develop scalable parallel programs for data-intensive applications on clusters of commodity machines. However, it does not directly support processing multiple related heterogeneous data sets, which is common in operations such as spatial joins. This paper presents SJMR (Spatial Join with MapReduce), a novel parallel algorithm that addresses this problem. Its strategies include a strip-based plane-sweeping algorithm, a tile-based spatial partitioning function, and a duplication avoidance technique. We evaluated the performance of the SJMR algorithm in various situations with real-world data sets. The results demonstrate the applicability of MapReduce to computing-intensive spatial applications on small-scale clusters.
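The abstract names the techniques but does not describe them, so the following is only a rough single-process sketch of how tile-based partitioning and duplication avoidance might fit together. Every name here (`tiles_for`, `spatial_join`, `tile_size`) is hypothetical. The reference-point rule used for duplication avoidance is an assumption, not necessarily the method in the paper, and a plain nested loop stands in for the strip-based plane sweep.

```python
from collections import defaultdict
from typing import Iterator, List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def tiles_for(rect: Rect, tile_size: float) -> Iterator[Tuple[int, int]]:
    """Yield the (col, row) index of every grid tile the rectangle overlaps."""
    xmin, ymin, xmax, ymax = rect
    for col in range(int(xmin // tile_size), int(xmax // tile_size) + 1):
        for row in range(int(ymin // tile_size), int(ymax // tile_size) + 1):
            yield (col, row)


def intersects(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def spatial_join(r_set: List[Rect], s_set: List[Rect],
                 tile_size: float) -> List[Tuple[int, int]]:
    # "Map" step: replicate each rectangle into every tile it overlaps.
    buckets = defaultdict(lambda: ([], []))
    for i, r in enumerate(r_set):
        for tile in tiles_for(r, tile_size):
            buckets[tile][0].append(i)
    for j, s in enumerate(s_set):
        for tile in tiles_for(s, tile_size):
            buckets[tile][1].append(j)

    # "Reduce" step: join within each tile. A nested loop is used here
    # for brevity in place of a plane sweep.
    results = []
    for tile, (r_ids, s_ids) in buckets.items():
        for i in r_ids:
            for j in s_ids:
                a, b = r_set[i], s_set[j]
                if not intersects(a, b):
                    continue
                # Assumed duplication-avoidance rule: report the pair only
                # in the tile containing the lower-left corner of the
                # intersection, so it is emitted exactly once.
                ref = (int(max(a[0], b[0]) // tile_size),
                       int(max(a[1], b[1]) // tile_size))
                if ref == tile:
                    results.append((i, j))
    return sorted(results)
```

In this sketch, a pair of rectangles that share several tiles is found in each of them, but only the tile holding the reference point reports it, so no separate deduplication pass is needed.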
Keywords :
parallel programming; pattern clustering; MapReduce program; SJMR; duplication avoidance technology; heterogeneous related data sets processing; parallel programming model; spatial join parallelization; strip-based plane sweeping algorithm; tile-based spatial partitioning function; Clustering algorithms; Computer applications; Computer science; Concurrent computing; Logic; Mathematics; Parallel processing; Parallel programming; Partitioning algorithms; Rivers;
Conference_Title :
Cluster Computing and Workshops, 2009. CLUSTER '09. IEEE International Conference on
Conference_Location :
New Orleans, LA
Print_ISBN :
978-1-4244-5011-4
Electronic_ISBN :
1552-5244
DOI :
10.1109/CLUSTR.2009.5289178