Title :
Querying Distributed Spatial Datasets with Unknown Regions
Author :
Qijun Zhu ; Dik Lun Lee ; Wang-Chien Lee
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
Abstract :
This paper studies the problem of querying Bounded Spatial Datasets (BSDs). A BSD contains, within a coverage area, (i) objects with known locations and (ii) unknown regions, each of which bounds an unknown number of objects. We consider applications where each BSD is hosted on a server or site connected to a communication network and the coverage areas of the BSDs overlap. The challenge is to query the distributed BSDs to retrieve all objects satisfying the query and to minimize the unknown regions that may contain such objects, while minimizing the data transmission volume and the number of interactions between the query client and the sites. We develop query processing algorithms for two important types of spatial queries, namely range and k-nearest-neighbor (kNN) queries, following two approaches: the site-based approach and the area-based approach. Both aim to process only a subset of the sites while still obtaining the full answer to a query. Thus, optimal site selection and the corresponding site querying methods are important problems studied in this paper. In the area-based approach, we prove an optimal division and derive a practical heuristic that partitions a query and selects the best processing site for each partition, achieving even better efficiency than the site-based approach. Simulation results based on three real spatial datasets show that our proposed approaches significantly outperform a centralized baseline in terms of data transmission volume and the number of interactions between the query client and the distributed sites.
Keywords :
query processing; visual databases; BSDs; area-based approach; bounded spatial dataset querying; communication network; data transmission volume minimization; distributed spatial dataset querying; k-nearest-neighbor querying; kNN querying; object retrieval; optimal site selection; query processing algorithms; range querying; site querying methods; site-based approach; spatial querying; Data communication; Distributed databases; Mobile communication; Probabilistic logic; Semantics; Spatial databases; Distributed query processing; Mobile Computing; Spatial databases and GIS; cost-efficient query processing; data integration; incomplete datasets; kNN queries; range queries
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2013.169