Title :
Impact of Parallel Download on Job Scheduling in Data Grid Environment
Author :
Zhang, Junwei ; Lee, Bu-Sung ; Tang, Xueyan ; Yeo, Chai-Kiat
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore
Abstract :
Data-intensive applications, such as high energy physics, usually have a large amount of input data that requires analysis. These data are often shared and replicated across the data grid. As computing power increases, the delay caused by "waiting for input data" will become more pronounced. In this paper, we study the impact of parallel download on job scheduler performance in a data grid environment. A parallel downloading system that supports replicating data fragments and downloading the replicated fragments in parallel is presented. The performance of the parallel downloading system is compared with that of a non-parallel downloading system using three scheduling heuristics: shortest turnaround time (STT), least relative load (LRL) and data present (DP). Our simulation results show that the proposed parallel download approach greatly improves data grid performance for all three scheduling algorithms, in terms of the geometric mean of job turnaround time. The advantage of the parallel downloading system is felt most when the data grid has relatively low network bandwidth and relatively high computing power.
Keywords :
grid computing; parallel processing; scheduling; data grid environment; data present scheduling; job scheduling; least relative load scheduling; parallel downloading system; replicated data fragments; shortest turnaround time scheduling; Bandwidth; Concurrent computing; Delay; File servers; Grid computing; Heuristic algorithms; Physics computing; Power engineering computing; Processor scheduling; Scheduling algorithm; Data Grid; Job Scheduling; Parallel Download; Replication
Conference_Title :
Grid and Cooperative Computing, 2008. GCC '08. Seventh International Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-0-7695-3449-7
DOI :
10.1109/GCC.2008.57