• DocumentCode
    734367
  • Title
    Running MPI Applications over an Opportunistic Infrastructure
  • Author
    Bohorquez, Eliana ; Rosales, Eduardo ; Castro, Harold
  • Author_Institution
    Syst. & Comput. Eng. Dept., Univ. de los Andes, Bogotá, Colombia
  • fYear
    2015
  • fDate
    8-10 July 2015
  • Firstpage
    446
  • Lastpage
    453
  • Abstract
    We propose a method based on Open MPI and BLCR checkpoints that allows MPI applications to run on non-dedicated, failure-prone computing infrastructures. To this end, the method automatically detects failures and recovers the affected MPI applications while adding minimal overhead to the overall execution. The method was tested on UnaCloud, an opportunistic cloud computing IaaS implementation that provides private clouds supported by idle computing resources available in the computer laboratories of a university campus. The tests executed a Simple Ray Tracing MPI application whose rendering operations required several hours of processing and intercommunication among nodes. The results show that the proposed method, which relies on checkpoint/restart recovery techniques, can effectively run MPI applications even when the supporting infrastructure is highly volatile.
  • Keywords
    application program interfaces; checkpointing; cloud computing; failure analysis; message passing; ray tracing; rendering (computer graphics); BLCR checkpoint; Una cloud; automatic detection and recovery; checkpoint/restart recovery technique; cloud computing IaaS implementation; computer laboratory; failure-prone computing infrastructure; open MPI; opportunistic infrastructure; private cloud; ray tracing MPI application; rendering operation; university campus; Checkpointing; Cloud computing; Fault tolerance; Fault tolerant systems; Laboratories; Ray tracing; BLCR; Checkpoint/Recovery Systems; Checkpoint/Restart; Cloud computing; MPI; OpenMPI; UnaCloud;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Complex, Intelligent, and Software Intensive Systems (CISIS), 2015 Ninth International Conference on
  • Conference_Location
    Blumenau
  • Print_ISBN
    978-1-4799-8869-3
  • Type
    conf
  • DOI
    10.1109/CISIS.2015.65
  • Filename
    7185229