DocumentCode :
2680268
Title :
Supporting dynamic space-sharing on clusters of non-dedicated workstations
Author :
Chowdhury, Abdur ; Nicklas, Lisa D. ; Setia, Sanjeev K. ; White, Elizabeth L.
Author_Institution :
Dept. of Comput. Sci., George Mason Univ., Fairfax, VA, USA
fYear :
1997
fDate :
27-30 May 1997
Firstpage :
149
Lastpage :
158
Abstract :
Clusters of workstations are increasingly being viewed as a cost-effective alternative to parallel supercomputers. However, resource management and scheduling on workstation clusters are complicated by the fact that the number of idle workstations available for executing parallel applications is constantly fluctuating. We present a case for scheduling parallel applications on non-dedicated workstation clusters using dynamic space-sharing, a policy under which the number of processors allocated to an application can be changed during its execution. We describe an approach that uses application-level checkpointing and data repartitioning to support dynamic space-sharing and to handle the dynamic reconfiguration triggered when failure or owner activity is detected on a workstation being used by a parallel application. The performance advantages of dynamic space-sharing are quantified through a simulation study, and experimental results are presented for the overhead of dynamic reconfiguration of a grid-oriented data-parallel application using our approach.
Keywords :
parallel algorithms; processor scheduling; resource allocation; workstations; application level checkpointing; data repartitioning; dynamic reconfiguration; dynamic space sharing; grid oriented data parallel application; idle workstations; non dedicated workstation clusters; parallel application scheduling; parallel supercomputers; processor allocation; resource management; Application software; Checkpointing; Computer science; Concurrent computing; Dynamic scheduling; Fault detection; Fault tolerance; Processor scheduling; Resource management; Workstations;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Proceedings of the 17th International Conference on Distributed Computing Systems, 1997
Conference_Location :
Baltimore, MD
ISSN :
1063-6927
Print_ISBN :
0-8186-7813-5
Type :
conf
DOI :
10.1109/ICDCS.1997.597902
Filename :
597902