DocumentCode
592041
Title
Dynamic Processing Slots Scheduling for I/O Intensive Jobs of Hadoop MapReduce
Author
Kurazumi, S. ; Tsumura, Tomoaki ; Saito, Sakuyoshi ; Matsuo, Hiroshi
Author_Institution
Nagoya Inst. of Technol., Nagoya, Japan
fYear
2012
fDate
5-7 Dec. 2012
Firstpage
288
Lastpage
292
Abstract
Hadoop, consisting of Hadoop MapReduce and the Hadoop Distributed File System (HDFS), is a platform for large-scale data storage and processing. Distributed processing has become common as the amount of data worldwide has grown rapidly and the scale of processing has increased, so Hadoop has attracted many cloud computing enterprises and technology enthusiasts, and its user base continues to expand. Our work aims to speed up the execution of Hadoop jobs. In this paper, we propose dynamic processing slots scheduling for I/O-intensive jobs of Hadoop MapReduce, focusing on I/O wait during job execution. When CPU resources with a high rate of I/O wait are detected on an active TaskTracker node, additional free slots are created and more tasks are assigned to them, improving CPU utilization. We implemented our method on Hadoop 1.0.3 and achieved an improvement of up to about 23% in execution time.
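The abstract describes detecting a high I/O-wait ratio on a TaskTracker node as the trigger for adding processing slots. The following is a minimal sketch of that idea, not the authors' implementation: it samples the aggregate CPU line of /proc/stat on Linux and reports whether the node looks I/O-bound. The class name IoWaitSlotAdvisor and the threshold value are illustrative assumptions, not part of Hadoop 1.0.3.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class IoWaitSlotAdvisor {
    // Hypothetical threshold: suggest an extra slot when more than 30% of
    // CPU time since the last sample was spent waiting on I/O.
    private static final double IOWAIT_THRESHOLD = 0.30;

    private long lastTotal = 0;
    private long lastIoWait = 0;

    // Reads the aggregate "cpu" line of /proc/stat (Linux only).
    // Fields after "cpu": user nice system idle iowait irq softirq ...
    private long[] readCpuSample() throws IOException {
        try (BufferedReader r = new BufferedReader(new FileReader("/proc/stat"))) {
            String[] f = r.readLine().trim().split("\\s+");
            long total = 0;
            for (int i = 1; i < f.length; i++) total += Long.parseLong(f[i]);
            long iowait = Long.parseLong(f[5]);
            return new long[] { total, iowait };
        }
    }

    // Returns true if the node spent a high fraction of the interval in
    // I/O wait and could therefore take on an additional task.
    public boolean shouldAddSlot() throws IOException {
        long[] s = readCpuSample();
        long dTotal = s[0] - lastTotal;
        long dIoWait = s[1] - lastIoWait;
        lastTotal = s[0];
        lastIoWait = s[1];
        if (dTotal <= 0) return false; // first call or counter anomaly
        return (double) dIoWait / dTotal > IOWAIT_THRESHOLD;
    }
}

In a design along these lines, a per-node monitor would poll such an advisor periodically and raise or lower the node's slot count accordingly; the paper's actual scheduling policy inside the TaskTracker is more involved than this sketch.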
Keywords
distributed databases; input-output programs; public domain software; scheduling; CPU performance improvement; CPU resources; HDFS; Hadoop 1.0.3; Hadoop MapReduce; Hadoop Distributed File System; I/O intensive jobs; I/O wait; active TaskTracker node; cloud computing enterprises; distributed processing; dynamic processing slots scheduling; Central Processing Unit; Cloud computing; Distributed databases; Dynamic scheduling; File systems; Scheduling algorithms; Hadoop; MapReduce; Scheduling algorithm; Slots Scheduling
fLanguage
English
Publisher
ieee
Conference_Titel
Networking and Computing (ICNC), 2012 Third International Conference on
Conference_Location
Okinawa
Print_ISBN
978-1-4673-4624-5
Type
conf
DOI
10.1109/ICNC.2012.53
Filename
6424579