DocumentCode :
3140144
Title :
Deadline Queries: Leveraging the Cloud to Produce On-Time Results
Author :
Alves, David ; Bizarro, Pedro ; Marques, Paulo
Author_Institution :
CISUC, Univ. of Coimbra, Coimbra, Portugal
fYear :
2011
fDate :
4-9 July 2011
Firstpage :
171
Lastpage :
178
Abstract :
MapReduce has become a widely used tool for computing complex tasks that process massive amounts of data in large clusters. Support for MapReduce tasks in cloud environments has been provided, but it is left to users to make best guesses about the number of nodes needed for a task to complete within an acceptable time. Moreover, the time a task will take to complete is often unknown beforehand. Previous research addressed this problem by establishing time constraints for query execution and, when needed, reducing the accuracy of queries using result approximation and/or sampling. However, in many situations reduced accuracy is not tolerable. In this paper we present Flood DQ, a MapReduce system that implements deadline queries -- queries that must finish before a deadline, never discarding data or reducing accuracy. Flood DQ produces timely, accurate results by adaptively increasing or decreasing computing power, at runtime, towards completing execution within the specified deadline. In Flood DQ, users only specify a deadline and the input data. The system monitors the progress of the task and extrapolates whether it will complete on time. If the task is projected to complete after the specified time, the system requests more nodes from an IaaS Cloud provider and adds them to the computation. On the other hand, if the task is projected to complete before the specified time, the system quiesces and releases surplus nodes, cutting costs to a minimum. This paper describes FloodDQ's architecture for supporting deadline queries and presents experimental results where the system always meets the deadline in spite of changes to the number of nodes, size of data or existence of perturbations.
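The monitor-and-extrapolate loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`projected_finish`, `nodes_needed`), the parameters, and above all the assumption that throughput scales linearly with the number of nodes are hypothetical choices made only for illustration.

```python
import math


def projected_finish(progress: float, elapsed_s: float) -> float:
    """Extrapolate total run time (seconds) from the fraction of work done so far.

    Assumes progress accrues at a constant rate (an illustrative assumption).
    """
    if not 0.0 < progress <= 1.0 or elapsed_s <= 0:
        raise ValueError("need progress in (0, 1] and elapsed_s > 0")
    return elapsed_s / progress


def nodes_needed(progress: float, elapsed_s: float, deadline_s: float,
                 current_nodes: int, min_nodes: int = 1) -> int:
    """Estimate how many nodes would finish the remaining work by the deadline.

    Hypothetical model: throughput is proportional to the node count, so the
    required count is the remaining work divided by what one node can do in
    the time that is left.
    """
    if not 0.0 < progress <= 1.0 or elapsed_s <= 0:
        raise ValueError("need progress in (0, 1] and elapsed_s > 0")
    if progress == 1.0:
        return min_nodes  # nothing left to compute; keep only the minimum
    remaining_s = deadline_s - elapsed_s
    if remaining_s <= 0:
        raise ValueError("deadline has already passed")
    # Work done per node-second so far is progress / (elapsed_s * current_nodes).
    required = (1.0 - progress) * elapsed_s * current_nodes / (progress * remaining_s)
    return max(min_nodes, math.ceil(required))


if __name__ == "__main__":
    # Behind schedule: 25% done after 100 s with 4 nodes, deadline at 200 s.
    print(projected_finish(0.25, 100.0))      # extrapolated finish time
    print(nodes_needed(0.25, 100.0, 200.0, 4))  # more nodes requested
    # Ahead of schedule: 50% done after 100 s with 4 nodes, deadline at 400 s.
    print(nodes_needed(0.5, 100.0, 400.0, 4))   # surplus nodes can be released
```

Under this model, a result larger than `current_nodes` corresponds to requesting nodes from the IaaS provider, and a smaller result corresponds to releasing surplus nodes. A real system would also have to account for node start-up delay and non-linear scaling, which this sketch ignores.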
Keywords :
approximation theory; cloud computing; extrapolation; query processing; sampling methods; FloodDQ architecture; IaaS cloud provider; MapReduce; approximation; cost cutting; deadline query; extrapolation; query execution; sampling method; time constraints; Accuracy; Computer architecture; Data processing; Estimation; Pipelines; Routing; Runtime; Cloud Computing; Deadline queries; Elastic MapReduce;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cloud Computing (CLOUD), 2011 IEEE International Conference on
Conference_Location :
Washington, DC
ISSN :
2159-6182
Print_ISBN :
978-1-4577-0836-7
Electronic_ISBN :
2159-6182
Type :
conf
DOI :
10.1109/CLOUD.2011.12
Filename :
6008707