DocumentCode :
3244973
Title :
A parallel and fault tolerant file system based on NFS servers
Author :
García, F. ; Calderón, A. ; Carretero, J. ; Pérez, J.M. ; Fernández, J.
Author_Institution :
Comput. Sci. Dept., Univ. Carlos III de Madrid, Leganes, Spain
fYear :
2003
fDate :
5-7 Feb. 2003
Firstpage :
83
Lastpage :
90
Abstract :
One important piece of system software for clusters is the parallel file system. No current parallel file system or parallel I/O library for clusters uses standard servers, which makes these systems very difficult to use in heterogeneous environments. However, why use proprietary or special-purpose servers on the server end of a parallel file system when most of the necessary functionality is already available in NFS servers? This paper describes the fault tolerance implemented in Expand (Expandable Parallel File System), a parallel file system based on NFS servers. Expand allows the transparent use of multiple NFS servers as a single file system, providing a single name space. The different NFS servers are combined to create a distributed partition across which files are striped. Expand requires no changes to the NFS server and uses RPC operations to provide parallel access to the same file. Expand is also independent of the clients, because all operations are implemented using RPC and the NFS protocol. Using this system, heterogeneous servers (Linux, Solaris, Windows 2000, etc.) can be joined to provide a parallel and distributed partition. Fault tolerance is achieved using RAID techniques applied to parallel files. The paper describes the design of Expand and the evaluation of a prototype using the MPI-IO interface. The evaluation was carried out on Linux clusters and compares Expand with PVFS.
Keywords :
RAID; file organisation; input-output programs; message passing; operating systems (computers); protocols; remote procedure calls; software fault tolerance; software performance evaluation; workstation clusters; Expand; Expandable Parallel File System; Linux clusters; MPI-IO interface; NFS protocol; NFS servers; RAID; RPC operations; clusters; evaluation; fault tolerant file system; heterogeneous servers; parallel access; system software; Computer architecture; Computer science; Fault tolerance; Fault tolerant systems; File servers; File systems; Libraries; Linux; Parallel processing; Space technology;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel, Distributed and Network-Based Processing, 2003. Proceedings. Eleventh Euromicro Conference on
Conference_Location :
Genova, Italy
ISSN :
1066-6192
Print_ISBN :
0-7695-1875-3
Type :
conf
DOI :
10.1109/EMPDP.2003.1183570
Filename :
1183570