DocumentCode :
172954
Title :
GISQF: An Efficient Spatial Query Processing System
Author :
Al-Naami, Khaled Mohammed ; Seker, Sadi ; Khan, Latifur
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas at Dallas, Dallas, TX, USA
fYear :
2014
fDate :
June 27 2014-July 2 2014
Firstpage :
681
Lastpage :
688
Abstract :
Built by collecting observations from all international news coverage and coding events with the TABARI software, the Global Database of Event, Language, and Tone (GDELT) is the only global political georeferenced event dataset, with 250+ million observations covering all countries in the world from January 1, 1979 to the present, updated daily. The purpose of this widely used dataset is to help understand and uncover spatial, temporal, and perceptual trends and behaviors of the social and international system. Traditional RDBMSs can no longer be used to query such big geospatial data, and parallel distributed solutions have become a necessity. The MapReduce paradigm has proved to be a scalable platform for processing and analyzing Big Data in the cloud. Hadoop, an implementation of MapReduce, is an open source application that has been widely used and accepted in academia and industry. However, Hadoop is not well equipped to deal with spatial data and falls short in terms of running time. SpatialHadoop is an extension of Hadoop with support for spatial data. In this paper, we present the Geographic Information System Querying Framework (GISQF) for processing massive spatial data. The framework is built on top of the open source SpatialHadoop system, which exploits two-layer spatial indexing techniques to speed up query processing. We show how this solution outperforms Hadoop query processing by orders of magnitude when queries are applied to a 60 GB GDELT dataset. We show results for three types of queries: Longitude-Latitude Point queries, Circle-Area queries, and Aggregation queries.
Keywords :
Big Data; cloud computing; data analysis; geographic information systems; public domain software; query processing; Big Data analysis; GDELT; GISQF; Global Database of Event, Language, and Tone; Hadoop query processing; MapReduce paradigm; SpatialHadoop system; TABARI software; aggregation queries; circle-area queries; cloud computing; geographic information system querying framework; geospatial data; global political georeferenced event dataset; longitude-latitude point queries; open source application; spatial query processing system; two-layer spatial indexing techniques; Indexing; Partitioning algorithms; Query processing; Shape; Software; Spatial databases;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cloud Computing (CLOUD), 2014 IEEE 7th International Conference on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4799-5062-1
Type :
conf
DOI :
10.1109/CLOUD.2014.96
Filename :
6973802