Title : 
H-PFSP: Efficient Hybrid Parallel PFSP Protected Scheduling for MapReduce System
         
        
            Author : 
Yin Li ; Chuang Lin ; Fengyuan Ren ; Yifeng Geng
         
        
            Author_Institution : 
Tsinghua Nat. Lab. for Inf. Sci. & Technol. (TNList), Tsinghua Univ., Beijing, China
         
        
        
        
        
        
            Abstract : 
MapReduce provides a data-parallel computing framework and has emerged as a popular processing model because of the simplicity of its operations for big data application developers. Data processing applications from many different domains, such as search and data mining, are usually developed using the open-source Hadoop implementation of MapReduce or self-developed MapReduce-like implementations such as Dryad [1] and Ciel [2]. In cloud environments, products like Amazon's Elastic Compute Cloud (EC2) [3] provide MapReduce as a third-party multi-tenant service. Even within a single company, a number of products may share one MapReduce cluster. Therefore, a fair and efficient scheduler is crucial to improve the performance of submitted jobs and to guarantee multi-user fairness. In practice, however, it is hard to guarantee both fairness and per-job performance, especially when jobs are scheduled without accurate estimation. We show that processor-sharing (PS) schedulers such as the Fair scheduler degrade per-job performance in a multi-user environment. We present a new scheduling policy, Hybrid Parallel pessimistic Fair Schedule Protocol (H-PFSP), that finishes every job no later than the Fair scheduler does. Unlike the Fair scheduler, however, it can improve the per-job performance of MapReduce systems when job progress estimation is relatively accurate.
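The contrast between processor sharing and a schedule that finishes every job no later than processor sharing can be illustrated with a toy model. The sketch below is not the H-PFSP algorithm from the paper. It assumes a single server of unit capacity, jobs that all arrive at time 0, and exactly known job sizes. The function names `ps_completion_times` and `serial_in_ps_order` are hypothetical and chosen only for this illustration. It compares equal sharing among all unfinished jobs with running jobs one at a time in the order in which processor sharing would complete them.

```python
def ps_completion_times(sizes):
    """Completion times under ideal processor sharing.

    Assumption: one unit-capacity server, all jobs arrive at time 0,
    and every unfinished job receives an equal share of the server.
    """
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    finish = [0.0] * len(sizes)
    now = 0.0        # current time
    served = 0.0     # work each unfinished job has received so far
    active = len(sizes)
    for i in order:
        # The next-smallest job needs (sizes[i] - served) more work,
        # delivered at rate 1/active, so it takes that much times active.
        now += (sizes[i] - served) * active
        served = sizes[i]
        finish[i] = now
        active -= 1
    return finish


def serial_in_ps_order(sizes):
    """Completion times when jobs run one at a time at full speed,
    ordered by the time processor sharing would have finished them."""
    ps = ps_completion_times(sizes)
    order = sorted(range(len(sizes)), key=lambda i: ps[i])
    finish = [0.0] * len(sizes)
    now = 0.0
    for i in order:
        now += sizes[i]
        finish[i] = now
    return finish


sizes = [1.0, 2.0, 4.0]
print(ps_completion_times(sizes))   # [3.0, 5.0, 7.0]
print(serial_in_ps_order(sizes))    # [1.0, 3.0, 7.0]
```

In this toy case no job finishes later than under processor sharing, and the two smaller jobs finish earlier. The serial order relies on knowing job sizes in advance, which is why the accuracy of job progress estimation matters for a policy of this kind.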
         
        
            Keywords : 
parallel processing; scheduling; H-PFSP; MapReduce system; PS type schedulers; big data application developers; data processing applications; data-parallel computing framework; efficient hybrid parallel PFSP protected scheduling; fair scheduling; job progress estimation; multiuser environment; multiuser fairness; per-job performance; processor sharing; Algorithm design and analysis; Companies; Estimation; Schedules; Scheduling algorithms; MapReduce; performance
         
        
        
        
            Conference_Title : 
Trust, Security and Privacy in Computing and Communications (TrustCom), 2013 12th IEEE International Conference on
         
        
            Conference_Location : 
Melbourne, VIC
         
        
        
            DOI : 
10.1109/TrustCom.2013.133