DocumentCode
2214253
Title
Data conversion for process/thread migration and checkpointing
Author
Jiang, Hai ; Chaudhary, Vipin ; Walters, John Paul
Author_Institution
Inst. for Sci. Comput., Wayne State Univ., Detroit, MI
fYear
2003
fDate
9-9 Oct. 2003
Firstpage
473
Lastpage
480
Abstract
Process/thread migration and checkpointing schemes support load balancing, load sharing and fault tolerance to improve application performance and system resource usage on workstation clusters. To enable these schemes to work in heterogeneous environments, we have developed an application-level migration and checkpointing package, MigThread, to abstract computation states at the language level for portability. To save and restore such states across different platforms, we propose a novel "receiver makes right" (RMR) data conversion method, called coarse-grain tagged RMR (CGT-RMR), for efficient data marshalling and unmarshalling. Unlike common data representation standards, CGT-RMR does not require programmers to analyze data types, flatten aggregate types, and encode/decode scalar types explicitly within programs. With help from MigThread's type system, CGT-RMR assigns a tag to each data type and converts nonscalar types as a whole. This speeds up the data conversion process and eases the programming task dramatically, especially for the large data trunks common to migration and checkpointing. Armed with this "plug-and-play" style data conversion scheme, MigThread has been ported to work in heterogeneous environments. Some microbenchmarks and performance measurements within the SPLASH-2 suite are given to illustrate the efficiency of the data conversion process.
Keywords
distributed processing; resource allocation; system recovery; workstation clusters; checkpointing; data conversion method; data marshalling; data unmarshalling; fault tolerance; heterogeneous environment; load balancing; load sharing; process migration; system resource usage; thread migration; workstation cluster; Aggregates; Checkpointing; Data analysis; Data conversion; Fault tolerant systems; Load management; Packaging; Programming profession; Workstations; Yarn
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Processing, 2003. Proceedings. 2003 International Conference on
Conference_Location
Kaohsiung
ISSN
0190-3918
Print_ISBN
0-7695-2017-0
Type
conf
DOI
10.1109/ICPP.2003.1240612
Filename
1240612