DocumentCode :
654993
Title :
Matrix-Query: A Distributed SQL-Like Query Processing Model for Large Database Clusters
Author :
Qiao Liu ; Ping Ji ; Yuan Zuo
Author_Institution :
Dept. of Comput. Sci. & Eng., BeiHang Univ., Beijing, China
fYear :
2013
fDate :
10-12 Oct. 2013
Firstpage :
179
Lastpage :
185
Abstract :
With the development of distributed computation and the rapid growth of data, scientific research increasingly requires a high-efficiency relational data processing framework. Based on the characteristics of scientific data, such as bulk inserts and infrequent changes, this paper proposes a streaming processing model called Matrix-Query, together with a matching data storage architecture for relational queries. By transforming the original relational schema into entities and key-value indexes, the data storage solution enables more localized operations and data positioning. Compared with the traditional Map-Reduce model, Matrix-Query isolates subtasks from one another to ensure execution in a streaming and parallel manner, and reduces the negative impact of writing intermediate files. We also optimize the data structures and subtask management to improve the performance of Matrix-Query. The experimental results demonstrate the performance advantage of Matrix-Query over two well-known data processing systems, Hive and HadoopDB, which are built on top of the Map-Reduce model.
Keywords :
SQL; database indexing; distributed databases; natural sciences computing; query processing; relational databases; very large databases; HadoopDB; Hive; Map-Reduce model; Matrix-Query; bulk insert; data positioning; data processing system; data storage architecture; data structure optimization; distributed SQL-like query processing model; distributed computation; high-efficiency relational data processing framework; key-value indexing; large database clusters; localization operation; relational query; relational schema; scientific data; streaming processing model; subtask management; Computational modeling; Data models; Distributed databases; Indexing; Memory; Query processing; SQL; distributed computation; relational query processing model;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2013 International Conference on
Conference_Location :
Beijing
Type :
conf
DOI :
10.1109/CyberC.2013.36
Filename :
6685677