Regional Information Center for Science and Technology - Matrix-Query: A Distributed SQL-Like Query Processing Model for Large Database Clusters

DocumentCode :

654993

Title :

Matrix-Query: A Distributed SQL-Like Query Processing Model for Large Database Clusters

Author :

Qiao Liu ; Ping Ji ; Yuan Zuo

Author_Institution :

Dept. of Comput. Sci. & Eng., BeiHang Univ., Beijing, China

fYear :

2013

fDate :

10-12 Oct. 2013

Firstpage :

179

Lastpage :

185

Abstract :

Along with the development of distributed computation and the rapid growth of data, scientific research increasingly requires the support of a high-efficiency relational data processing framework. Based on the characteristics of scientific data, such as bulk inserts and infrequent changes, this paper proposes a streaming processing model called Matrix-Query, with a matching data storage architecture for relational queries. By transforming the original relational schema into entities and key-value indexing, the data storage solution provides more localized operations and data positioning. Compared to the traditional Map-Reduce model, Matrix-Query isolates the influence between subtasks to ensure execution in a streaming and parallel manner and reduces the negative impact of writing intermediate files. We also optimize the data structure and subtask management to improve the performance of Matrix-Query. The experimental results demonstrate the performance advantage of Matrix-Query compared to two well-known data processing systems, Hive and HadoopDB, which are built on top of the Map-Reduce model.

Keywords :

SQL; database indexing; distributed databases; natural sciences computing; query processing; relational databases; very large databases; HadoopDB; Hive; Map-Reduce model; Matrix-Query; bulk insert; data positioning; data processing system; data storage architecture; data structure optimization; distributed SQL-like query processing model; distributed computation; high-efficiency relational data processing framework; key-value indexing; large database clusters; localization operation; relational query; relational schema; scientific data; streaming processing model; subtask management; Computational modeling; Data models; Indexing; Memory; relational query processing model

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2013 International Conference on

Conference_Location :

Beijing

Type :

conf

DOI :

10.1109/CyberC.2013.36

Filename :

6685677

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=654993