Abstract:
Commercial clouds have increasingly become a viable platform for hosting scientific analyses and computation due to their elasticity, the recent introduction of specialist hardware, and a pay-as-you-go cost model. This computing paradigm therefore presents a low-capital, low-barrier alternative to operating dedicated eScience infrastructure. Indeed, commercial clouds now enable universal access to capabilities previously available only to large, well-funded research groups. While the potential benefits of cloud computing are clear, significant technical hurdles remain in obtaining the best execution efficiency while trading off cost. Large-scale scientific analyses are typically represented as workflows in order to manage multiple tools and data sets. Mapping workflow tasks onto a set of provisioned instances is an example of the general scheduling problem and is NP-complete. In this case, the mapping also involves elasticity: additional instances may be provisioned as part of the mapping process. In this paper we present a new algorithm, Proportional Deadline Constrained (PDC), that addresses eScience workflow scheduling in the cloud. PDC's aim is to minimize cost while meeting deadline constraints. To validate the PDC algorithm, we constructed a CloudSim test bed and compared PDC with two other similar algorithms over three workflows. Our results demonstrate that PDC generally achieves lower costs for a given deadline and, more significantly, is usually able to construct a viable schedule under tight deadlines where the other algorithms studied cannot.
Keywords:
"Cloud computing","Scheduling","Schedules","Partitioning algorithms","Scheduling algorithms","Quality of service"