Title : 
Performance and Availability Tradeoffs in Replicated File Systems
         
        
            Author : 
Zhang, Jiaying ; Honeyman, Peter
         
        
            Author_Institution : 
Google, Inc., Santa Monica, CA
         
        
        
        
        
            Abstract : 
Replication is a key technique for improving fault tolerance but can introduce considerable performance overhead under some circumstances. To explore the tradeoff between performance and failure resilience, we develop a calculus that takes into consideration the I/O characteristics of applications and failure behavior of distributed storage nodes. With the developed evaluation model, we then prescribe a file system replication strategy that maximizes the utilization of computational resources for long-running and compute-intensive grid applications.
         
        
            Keywords : 
fault tolerant computing; grid computing; distributed storage nodes; failure resilience; fault tolerance; performance overhead; replicated file systems; availability; calculus; computer networks; delay; file servers; file systems; network servers; resilience; grid; replication; performance; tradeoff
         
        
        
        
            Conference_Title : 
8th IEEE International Symposium on Cluster Computing and the Grid (CCGRID '08), 2008
         
        
            Conference_Location : 
Lyon
         
        
            Print_ISBN : 
978-0-7695-3156-4
         
        
            Electronic_ISBN : 
978-0-7695-3156-4
         
        
        
            DOI : 
10.1109/CCGRID.2008.80