• DocumentCode
    560220
  • Title

    Cloud versus in-house cluster: Evaluating Amazon cluster compute instances for running MPI applications

  • Author

    Zhai, Yan ; Liu, Mingliang ; Zhai, Jidong ; Ma, Xiaosong ; Chen, Wenguang

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
  • fYear
    2011
  • fDate
    12-18 Nov. 2011
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    The emergence of cloud services brings new possibilities for constructing and using HPC platforms. However, while cloud services provide the flexibility and convenience of customized, pay-as-you-go parallel computing, multiple previous studies in the past three years have indicated that cloud-based clusters need a significant performance boost to become a competitive choice, especially for tightly coupled parallel applications. In this work, we examine the feasibility of running HPC applications in clouds. This study distinguishes itself from existing investigations in several ways: 1) We carry out a comprehensive examination of issues relevant to the HPC community, including performance, cost, user experience, and range of user activities. 2) We compare an Amazon EC2-based platform built upon its newly available HPC-oriented virtual machines with typical local cluster and supercomputer options, using benchmarks and applications with scale and problem size unprecedented in previous cloud HPC studies. 3) We perform detailed performance and scalability analysis to locate the chief limiting factors of the state-of-the-art cloud-based clusters. 4) We present a case study on the impact of per-application parallel I/O system configuration uniquely enabled by cloud services. Our results reveal that though the scalability of EC2-based virtual clusters still lags behind traditional HPC alternatives, they are rapidly gaining in overall performance and cost-effectiveness, making them feasible candidates for performing tightly coupled scientific computing. In addition, our detailed benchmarking and profiling discloses and analyzes several problems regarding the performance and performance stability on EC2.
  • Keywords
    application program interfaces; cloud computing; message passing; parallel processing; pattern clustering; virtual machines; Amazon EC2-based platform; Amazon cluster compute instance evaluation; EC2-based virtual clusters; HPC platforms; HPC-oriented virtual machines; MPI applications; cloud services; cloud-based clusters; cost; in-house cluster; local cluster; parallel I-O system configuration; parallel computing; performance; scientific computing; supercomputer; user activities; user experience; Bandwidth; Benchmark testing; Cloud computing; Hardware; Servers; Stability analysis; Supercomputers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for
  • Conference_Location
    Seattle, WA
  • Electronic_ISBN
    978-1-4503-0771-0
  • Type

    conf

  • Filename
    6114491