DocumentCode :
3390521
Title :
Nomad: a scalable operating system for clusters of uni- and multiprocessors
Author :
Pinheiro, Eduardo ; Bianchini, Ricardo
Author_Institution :
COPPE Syst. Eng., Fed. Univ. of Rio de Janeiro, Brazil
fYear :
1999
fDate :
1999
Firstpage :
247
Lastpage :
254
Abstract :
The recent improvements in workstation and interconnection network performance have popularized clusters of off-the-shelf workstations. However, the usefulness of these clusters is yet to be fully exploited, mostly due to the inadequate management of cluster resources implemented by current distributed operating systems. In order to eliminate this problem and approach the computational power of large clusters of workstations, in this paper we propose Nomad, an efficient operating system for clusters of uni- and/or multiprocessors. Nomad includes several important characteristics for modern cluster-oriented operating systems: scalability, efficient resource management across the cluster, efficient scheduling of parallel and distributed applications, distributed I/O, fault detection and recovery, protection, and backward compatibility. Some of the mechanisms used by Nomad, such as process checkpointing and migration, can be found in previously proposed systems. However, our system stands out for its strategy for disseminating information across the cluster and its efficient management of all cluster resources. In addition, Nomad is highly scalable as it uses neither centralized control nor extra messages to implement its functionality, taking advantage of the I/O traffic associated with its distributed file system. Our preliminary evaluation of the load balancing aspect of Nomad shows that the pattern of file accesses in our distributed file system allows for efficient and scalable load balancing. Our main conclusion is that the complete implementation of Nomad will most likely be efficient and will be a nice platform for future research on operating systems for clusters of workstations.
Keywords :
multiprocessing systems; network operating systems; performance evaluation; resource allocation; workstation clusters; Nomad; backward compatibility; clusters of workstations; distributed operating systems; interconnection network performance; multiprocessors cluster; process checkpointing; resource management; scalability; scalable operating system; Fault detection; Load management; Multiprocessor interconnection networks; Operating systems; Power system management; Processor scheduling; Protection; Resource management; Scalability; Workstations;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster Computing, 1999. Proceedings. 1st IEEE Computer Society International Workshop on
Conference_Location :
Melbourne, Vic.
Print_ISBN :
0-7695-0343-8
Type :
conf
DOI :
10.1109/IWCC.1999.810831
Filename :
810831