Title :
BDAP: A Big Data Placement Strategy for Cloud-Based Scientific Workflows
Author :
Ebrahimi, Mahdi ; Mohan, Aravind ; Kashlev, Andrey ; Lu, Shiyong
Author_Institution :
Wayne State Univ., Detroit, MI, USA
Date :
March 30 2015-April 2 2015
Abstract :
In this new era of Big Data, there is a growing need to enable scientific workflows to perform computations at a scale far exceeding a single workstation's capabilities. When such data-intensive workflows run in a cloud distributed across several physical locations, the execution time and the resource utilization efficiency depend heavily on the initial placement and distribution of the input datasets across the multiple virtual machines in the Cloud. In this paper, we propose BDAP (Big DAta Placement strategy), a strategy that improves workflow performance by minimizing data movement across multiple virtual machines. In this work, we 1) formalize the data placement problem in scientific workflows, 2) propose a data placement algorithm that considers both the initial input datasets and the intermediate datasets produced during the workflow run, and 3) perform extensive experiments in a distributed environment to verify that the proposed strategy provides an effective data placement solution that distributes and places big datasets at the appropriate virtual machines in the Cloud within a reasonable time.
Keywords :
Big Data; cloud computing; resource allocation; virtual machines; workflow management software; BDAP; big data placement strategy; cloud-based scientific workflows; data intensive workflows; resource utilization; Computational modeling; Data models; Distributed databases; Resource management; Virtual machining; Data Placement; Evolutionary Algorithm; Scientific Workflow;
Conference_Title :
Big Data Computing Service and Applications (BigDataService), 2015 IEEE First International Conference on
Conference_Location :
Redwood City, CA
DOI :
10.1109/BigDataService.2015.70