Author :
Hsu, Ya-Ting ; Pan, Yi-Chin ; Wei, Ling-Yin ; Peng, Wen-Chih ; Lee, Wang-Chien
Author_Institution :
Dept. of Comput. Sci., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Abstract :
Owing to its flexibility and scalability, cloud computing now plays an important role in large-scale data analysis. For data processing operations, several cloud data management systems (CDMs), such as HBase and Cassandra, have been developed. Such CDMs usually provide key-value storage, where each key is used to access its corresponding value. Both HBase and Cassandra provide basic operations (e.g., Get, Scan) to retrieve values via user-specified keys. Existing CDMs fully inherit the characteristics of cloud computing (i.e., high scalability and availability). Because of these characteristics, CDMs are widely employed for Web data, especially for search engines. However, with the proliferation of smart phones and location-based services, data with spatial information, referred to as spatial data, are increasing dramatically. Consequently, how to formulate keys for spatial data in existing CDMs is a challenging issue. In this paper, we develop several key formulation schemes. In particular, we propose a novel Key formulation scheme based on the R+-tree (abbreviated as KR+-index). With our design for keys of spatial data, existing CDMs are able to retrieve spatial data efficiently. Based on the KR+-index, two spatial queries, the k-NN query and the range query, are designed. Moreover, we implement the proposed key formulation schemes on HBase and Cassandra, and import real spatial data for spatial queries. The experimental results demonstrate that the KR+-index outperforms other existing key formulations and MD-HBase.
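As a rough illustration of what "formulating a key for spatial data" can mean in a key-value store, the sketch below maps a longitude/latitude pair to a grid cell and interleaves the cell coordinates into a Z-order (Morton) code. This is a generic technique chosen for illustration only; it is not the KR+-index described in the abstract. The names `interleave_bits`, `spatial_key`, and `SortedKVStore` are hypothetical, and the in-memory store only mimics Get/Scan-style access over sorted keys rather than calling any HBase or Cassandra API.

```python
from bisect import bisect_left, insort


def interleave_bits(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Z-order code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code


def spatial_key(lon, lat, bits=16):
    """Map (lon, lat) to a fixed-width hex key (assumed grid of 2**bits cells per axis)."""
    cells = 1 << bits
    # Scale each coordinate to a cell index and clamp the upper boundary.
    col = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    row = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    width = (2 * bits + 3) // 4  # hex digits needed for 2*bits bits
    return format(interleave_bits(col, row, bits), "0{}x".format(width))


class SortedKVStore:
    """Toy stand-in for a key-ordered store (hypothetical, in memory only)."""

    def __init__(self):
        self._keys = []    # keys kept in sorted order
        self._values = {}  # key -> value

    def put(self, key, value):
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def scan(self, start, stop):
        """Return (key, value) pairs with start <= key < stop, in key order."""
        lo = bisect_left(self._keys, start)
        hi = bisect_left(self._keys, stop)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


# Example usage: store two points and scan a key range covering both.
store = SortedKVStore()
store.put(spatial_key(120.97, 24.80), "point A")
store.put(spatial_key(-73.99, 40.73), "point B")
all_points = store.scan("00000000", "ffffffff")
```

Fixed-width keys keep lexicographic order aligned with numeric order, which is what lets a single Scan over a key interval return spatially clustered records. A Z-order curve can still place neighboring cells far apart in key space, so a spatial range query under this kind of scheme generally needs several scans.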
Keywords :
cloud computing; database management systems; indexing; information retrieval; Cassandra; HBase; Web data; cloud computing scalability; cloud data management; data processing operation; k-NN query; key formulation scheme; key value storage; large scale data analysis; location based services; range query; search engines; smart phones; spatial data retrieval; spatial index; spatial information; spatial queries; Cloud computing; Global Positioning System; Indexing; Scalability; Spatial databases;