Title :
Converting a High Performance Application to an Elastic Cloud Application
Author :
Rajan, Dinesh ; Canino, Anthony ; Izaguirre, Jesus A. ; Thain, Douglas
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Notre Dame, Notre Dame, IN, USA
Date :
Nov. 29 2011-Dec. 1 2011
Abstract :
Over the past decade, high performance applications have embraced parallel programming and computing models. While parallel computing offers advantages such as good utilization of dedicated hardware resources, it also has several drawbacks, such as poor fault tolerance, limited scalability, and a limited ability to harness available resources at run time. The advent of cloud computing presents a viable and promising alternative to parallel computing because of its advantages in offering a distributed computing model. In this work, we establish directives that serve as guidelines for designing and implementing, or identifying, a suitable cloud computing framework with which to build or convert a high performance application to run in the cloud. We show that following these directives leads to an elastic implementation that has better scalability, run-time resource adaptability, fault tolerance, and portability across cloud computing platforms, while requiring minimal effort and intervention from the user. We illustrate this by converting an MPI implementation of replica exchange, a parallel tempering molecular dynamics application, to an elastic cloud application using the Work Queue framework, which adheres to these directives. We observe better scalability and resource adaptability of this elastic application on multiple platforms, including a homogeneous cluster environment (SGE) and heterogeneous cloud computing environments such as Microsoft Azure and Amazon EC2.
Keywords :
cloud computing; message passing; parallel programming; Amazon EC2; MPI implementation; Microsoft Azure; cloud computing framework; distributed computing model; elastic cloud application; fault tolerance; heterogeneous cloud computing environments; high performance application; homogeneous cluster environment; parallel computing; parallel tempering molecular dynamics application; portability; run-time resource adaptability; Work Queue framework; Computational modeling; Fault tolerant systems; Hardware; Parallel processing; Scalability; Elastic applications; distributed computing; high performance applications; molecular dynamics; replica exchange
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on
Conference_Location :
Athens
Print_ISBN :
978-1-4673-0090-2
DOI :
10.1109/CloudCom.2011.58