Regional Information Center for Science and Technology - Lotus: A framework for query optimization based on distributed cache

DocumentCode :

3730519

Title :

Lotus: A framework for query optimization based on distributed cache

Author :

Chaoyong Li; Gong Cheng; Jinwen Zhong; Can Ma; Weiping Wang; Dan Meng; Qing Wang; Bo Wang

Author_Institution :

Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093

Year :

2015

Firstpage :

1189

Lastpage :

1196

Abstract :

As data volumes grow continuously, demand for interactive queries on massive datasets is rising. When processing massive structured data, existing query engines make poor use of query locality and do not account for the differences among query operators, which makes them unsuitable for low-latency business scenarios. To address these problems, this paper proposes Lotus, a framework for query optimization based on a distributed cache. Lotus adopts three strategies: (1) a query-sensitive data distribution policy; (2) cache replacement based on statistical information; (3) optimized behavior of core operators. With these methods, Lotus improves the query performance of existing engines. The experimental study shows that Lotus can reduce the response latency and execution time of queries on large-scale structured data by more than 30% compared with SparkSQL or Impala.
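
The abstract does not say which statistics drive cache replacement in Lotus, so the following is only a minimal sketch of the general idea, assuming the simplest possible statistic: a per-key access count. The class name `StatsCache` and its methods are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class StatsCache:
    """Toy cache that evicts the entry with the fewest recorded accesses.

    Illustrative only: the access-count statistic is an assumption,
    not the replacement policy described in the Lotus paper.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.data = {}
        self.hits = defaultdict(int)  # access count per cached key

    def get(self, key):
        # A miss returns None and records nothing.
        if key not in self.data:
            return None
        self.hits[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            # Evict the least frequently accessed key.
            # Ties go to the earliest-inserted key (dict order).
            victim = min(self.data, key=lambda k: self.hits[k])
            del self.data[victim]
            del self.hits[victim]
        self.data[key] = value
        self.hits[key] += 1
```

With a capacity of two, inserting `a` and `b`, reading `a` once, and then inserting `c` evicts `b`, because `b` has the lowest access count at that point. A real distributed cache would likely combine several statistics (recency, block size, query cost), which this sketch leaves out.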

Keywords :

Query processing; Engines; Business; Distributed databases; Optimization; Benchmark testing; Data communication

Publisher :

IEEE

Conference_Title :

Fuzzy Systems and Knowledge Discovery (FSKD), 2015 12th International Conference on

Type :

conf

DOI :

10.1109/FSKD.2015.7382111

Filename :

7382111

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3730519