• DocumentCode
    2214253
  • Title
    Data conversion for process/thread migration and checkpointing
  • Author
    Jiang, Hai ; Chaudhary, Vipin ; Walters, John Paul
  • Author_Institution
    Inst. for Sci. Comput., Wayne State Univ., Detroit, MI
  • fYear
    2003
  • fDate
    9-9 Oct. 2003
  • Firstpage
    473
  • Lastpage
    480
  • Abstract
    Process/thread migration and checkpointing schemes support load balancing, load sharing and fault tolerance to improve application performance and system resource usage on workstation clusters. To enable these schemes to work in heterogeneous environments, we have developed an application-level migration and checkpointing package, MigThread, to abstract computation states at the language level for portability. To save and restore such states across different platforms, we propose a novel "receiver makes right" (RMR) data conversion method, called coarse-grain tagged RMR (CGT-RMR), for efficient data marshalling and unmarshalling. Unlike common data representation standards, CGT-RMR does not require programmers to analyze data types, flatten aggregate types, and encode/decode scalar types explicitly within programs. With help from MigThread's type system, CGT-RMR assigns a tag to each data type and converts nonscalar types as a whole. This speeds up the data conversion process and eases the programming task dramatically, especially for the large data trunks common to migration and checkpointing. Armed with this "plug-and-play" style data conversion scheme, MigThread has been ported to work in heterogeneous environments. Some microbenchmarks and performance measurements within the SPLASH-2 suite are given to illustrate the efficiency of the data conversion process.
  • Keywords
    distributed processing; resource allocation; system recovery; workstation clusters; checkpointing; data conversion method; data marshalling; data unmarshalling; fault tolerance; heterogeneous environment; load balancing; load sharing; process migration; system resource usage; thread migration; workstation cluster; Aggregates; Checkpointing; Data analysis; Data conversion; Fault tolerant systems; Load management; Packaging; Programming profession; Workstations; Yarn
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 2003. Proceedings. 2003 International Conference on
  • Conference_Location
    Kaohsiung
  • ISSN
    0190-3918
  • Print_ISBN
    0-7695-2017-0
  • Type
    conf
  • DOI
    10.1109/ICPP.2003.1240612
  • Filename
    1240612
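
The "receiver makes right" idea summarized in the abstract can be illustrated with a minimal sketch. This is not MigThread's API or the paper's CGT-RMR implementation: the tag format (`"<byte order>:<layout>"`), the function names `send_block` and `make_right`, and the use of Python's `struct` layout strings are all assumptions made for illustration. The sketch covers byte order only. The sender ships a whole aggregate in its own representation together with one tag, and the receiver converts the block as a unit only when its representation differs.

```python
import struct
import sys


def local_order():
    """Return the struct byte-order prefix of the machine running this code."""
    return "<" if sys.byteorder == "little" else ">"


def send_block(layout, values, sender_order=None):
    """Sender side (hypothetical): pack an aggregate in the sender's own byte
    order and attach one tag describing the whole block."""
    order = sender_order or local_order()
    payload = struct.pack(order + layout, *values)
    tag = f"{order}:{layout}"  # assumed tag format, for illustration only
    return tag, payload


def make_right(tag, payload, receiver_order=None):
    """Receiver side (hypothetical): convert the block only if the sender's
    representation differs from the receiver's."""
    sender_order, layout = tag.split(":", 1)
    order = receiver_order or local_order()
    if sender_order == order:
        return payload  # same representation: no conversion work at all
    # Different representation: convert the whole aggregate in one pass.
    values = struct.unpack(sender_order + layout, payload)
    return struct.pack(order + layout, *values)


# Usage: a big-endian sender, a little-endian receiver.
# Layout "idh" = int32, float64, int16 (struct standard sizes, no padding).
tag, payload = send_block("idh", (7, 2.5, -3), sender_order=">")
converted = make_right(tag, payload, receiver_order="<")
print(tag, struct.unpack("<idh", converted))
```

When sender and receiver share a representation, the payload is passed through untouched, which is the main appeal of a receiver-makes-right scheme over converting every message to a fixed canonical format.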