DocumentCode :
650607
Title :
Multi-query Unification for Generating Efficient Big Data Processing Components from a DFD
Author :
Kimura, K. ; Nomura, Yutaka ; Kurihara, Hiroshi ; Yamamoto, Koji ; Yamamoto, Ryo
Author_Institution :
Software Innovation Lab., FUJITSU Labs. Ltd., Kawasaki, Japan
fYear :
2013
fDate :
June 28 2013-July 3 2013
Firstpage :
260
Lastpage :
268
Abstract :
This paper proposes multi-query unification, a technique for generating unified components from a data flow diagram (DFD). The technique aims to reduce the total cost of data transmission between components deployed to a computing fabric that includes processing nodes and interconnection services. The method focuses on generating components for the two primary data processing methodologies: cumulative data processing (CDP) and data stream processing (DSP). It generates a unified query by applying two methods, chosen according to the order sensitivity of the processes in the DFD. Nesting unification composes a unified query by embedding the query of one process into the query of the next process as a subquery. Clause assembly unification composes a query using templates for each clause of the original queries. Because clause assembly is applicable only to processes that can be executed simultaneously, we define a criterion called order sensitivity for deciding when clause assembly applies, and we propose two-stage unification, in which nesting unification is always applied after clause assembly. A performance evaluation based on a virtual DFD shows that two-stage unification reduces the execution time of components by 60 percent in DSP but by only 10 percent in CDP. On the other hand, nesting unification alone reduces the execution time by 30 percent. Based on these results, we conclude that clause assembly should be applied to DSP using Esper but should not be applied to CDP using Hive.
Keywords :
query processing; very large databases; CDP; DSP; Esper; Hive; big data processing components; clause assembly unification; cumulative data processing; data stream processing; data transmission; interconnection services; multiquery unification; nesting unification; order sensitivity; query processing; total cost reduction; two-stage unification; unified query; virtual DFD; Assembly; Data analysis; Data models; Digital signal processing; Engines; Sensitivity; DFD; big data; component; multi-query unification; order sensitivity; platform as a service;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cloud Computing (CLOUD), 2013 IEEE Sixth International Conference on
Conference_Location :
Santa Clara, CA
Print_ISBN :
978-0-7695-5028-2
Type :
conf
DOI :
10.1109/CLOUD.2013.99
Filename :
6676703