  • DocumentCode
    714245
  • Title
    Large-scale spatial join query processing in Cloud
  • Author
    Simin You ; Jianting Zhang ; Le Gruenwald
  • Author_Institution
    Dept. of Comput. Sci., CUNY Grad. Center, New York, NY, USA
  • fYear
    2015
  • fDate
    13-17 April 2015
  • Firstpage
    34
  • Lastpage
    41
  • Abstract
    The rapidly increasing amount of location data available in many applications has made it desirable to process large-scale spatial queries in the Cloud for performance and scalability. We report the design and implementation of two prototype systems that are ready for Cloud deployment: SpatialSpark, based on Apache Spark, and ISP-MC, based on Cloudera Impala. Both systems support indexed spatial joins based on point-in-polygon tests and point-to-polyline distance computation. Experiments on the pickup locations of ~170 million taxi trips in New York City and ~10 million global species occurrence records demonstrate both efficiency and scalability on Amazon EC2 clusters.
  • Keywords
    cloud computing; geographic information systems; query processing; Amazon EC2 cluster; Apache Spark; Cloudera Impala; ISP-MC; SpatialSpark; cloud deployment; indexed spatial join; large-scale spatial join query processing; large-scale spatial query; location data; point-in-polygon test; point-to-polyline distance computation; taxi trip; Data processing; Filtering; Hardware; Scalability; Sparks; Spatial databases; Impala; Spark; Spatial Join
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW)
  • Conference_Location
    Seoul
  • Type
    conf
  • DOI
    10.1109/ICDEW.2015.7129541
  • Filename
    7129541