• DocumentCode
    3525045
  • Title

    BDAP: A Big Data Placement Strategy for Cloud-Based Scientific Workflows

  • Author

    Ebrahimi, Mahdi ; Mohan, Aravind ; Kashlev, Andrey ; Shiyong Lu

  • Author_Institution
    Wayne State Univ., Detroit, MI, USA
  • fYear
    2015
  • fDate
    March 30 2015-April 2 2015
  • Firstpage
    105
  • Lastpage
    114
  • Abstract
    In this new era of Big Data, there is a growing need to enable scientific workflows to perform computations at a scale far exceeding a single workstation's capabilities. When running such data-intensive workflows in the cloud, distributed across several physical locations, the execution time and the resource utilization efficiency depend heavily on the initial placement and distribution of the input datasets across the multiple virtual machines in the Cloud. In this paper, we propose BDAP (Big DAta Placement strategy), a strategy that improves workflow performance by minimizing data movement across multiple virtual machines. In this work, we 1) formalize the data placement problem in scientific workflows, 2) propose a data placement algorithm that considers both the initial input datasets and the intermediate datasets obtained during the workflow run, and 3) perform extensive experiments in a distributed environment to verify that our proposed strategy provides an effective data placement solution to distribute and place big datasets at the appropriate virtual machines in the Cloud within reasonable time.
  • Keywords
    Big Data; cloud computing; resource allocation; virtual machines; workflow management software; BDAP; big data placement strategy; cloud-based scientific workflows; data intensive workflows; resource utilization; virtual machines; Big data; Cloud computing; Computational modeling; Data models; Distributed databases; Resource management; Virtual machining; Big Data; Cloud Computing; Data Placement; Evolutionary Algorithm; Scientific Workflow;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data Computing Service and Applications (BigDataService), 2015 IEEE First International Conference on
  • Conference_Location
    Redwood City, CA
  • Type

    conf

  • DOI
    10.1109/BigDataService.2015.70
  • Filename
    7184870