DocumentCode
592041
Title
Dynamic Processing Slots Scheduling for I/O Intensive Jobs of Hadoop MapReduce
Author
Kurazumi, S. ; Tsumura, Tomoaki ; Saito, Sakuyoshi ; Matsuo, Hiroshi
Author_Institution
Nagoya Inst. of Technol., Nagoya, Japan
fYear
2012
fDate
5-7 Dec. 2012
Firstpage
288
Lastpage
292
Abstract
Hadoop, consisting of Hadoop MapReduce and the Hadoop Distributed File System (HDFS), is a platform for large-scale data storage and processing. Distributed processing has become common as the amount of data worldwide has grown rapidly and the scale of processing has increased, so Hadoop has attracted many cloud computing enterprises and technology enthusiasts, and its user base continues to expand. Our work aims to speed up the execution of Hadoop jobs. In this paper, we propose dynamic processing slots scheduling for I/O-intensive jobs of Hadoop MapReduce, focusing on I/O wait during job execution. When CPU resources with a high rate of I/O wait are detected on an active TaskTracker node, additional free slots are created and more tasks are assigned to them, improving CPU utilization. We implemented our method on Hadoop 1.0.3 and achieved an improvement of up to about 23% in execution time.
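The abstract describes detecting a high I/O-wait ratio on a TaskTracker node as the trigger for adding processing slots. The following is a minimal sketch of that idea, not the authors' implementation: it samples the aggregate CPU line of /proc/stat on Linux and reports whether the node looks I/O-bound. The class name IoWaitSlotAdvisor and the threshold value are illustrative assumptions, not part of Hadoop 1.0.3.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class IoWaitSlotAdvisor {
    // Hypothetical threshold: suggest an extra slot when more than 30% of
    // CPU time since the last sample was spent waiting on I/O.
    private static final double IOWAIT_THRESHOLD = 0.30;

    private long lastTotal = 0;
    private long lastIoWait = 0;

    // Reads the aggregate "cpu" line of /proc/stat (Linux only).
    // Fields after "cpu": user nice system idle iowait irq softirq ...
    private long[] readCpuSample() throws IOException {
        try (BufferedReader r = new BufferedReader(new FileReader("/proc/stat"))) {
            String[] f = r.readLine().trim().split("\\s+");
            long total = 0;
            for (int i = 1; i < f.length; i++) total += Long.parseLong(f[i]);
            long iowait = Long.parseLong(f[5]);
            return new long[] { total, iowait };
        }
    }

    // Returns true if the node spent a high fraction of the interval in
    // I/O wait and could therefore take on an additional task.
    public boolean shouldAddSlot() throws IOException {
        long[] s = readCpuSample();
        long dTotal = s[0] - lastTotal;
        long dIoWait = s[1] - lastIoWait;
        lastTotal = s[0];
        lastIoWait = s[1];
        if (dTotal <= 0) return false; // first call or counter anomaly
        return (double) dIoWait / dTotal > IOWAIT_THRESHOLD;
    }
}

In a design along these lines, a per-node monitor would poll such an advisor periodically and raise or lower the node's slot count accordingly; the paper's actual scheduling policy inside the TaskTracker is more involved than this sketch.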
Keywords
distributed databases; input-output programs; public domain software; scheduling; CPU performance improvement; CPU resources; HDFS; Hadoop 1.0.3; Hadoop MapReduce; Hadoop Distributed File System; I/O intensive jobs; I/O wait; active TaskTracker node; cloud computing enterprises; distributed processing; dynamic processing slots scheduling; Central Processing Unit; Cloud computing; Distributed databases; Dynamic scheduling; File systems; Scheduling algorithms; Hadoop; MapReduce; Scheduling algorithm; Slots Scheduling
fLanguage
English
Publisher
ieee
Conference_Titel
Networking and Computing (ICNC), 2012 Third International Conference on
Conference_Location
Okinawa
Print_ISBN
978-1-4673-4624-5
Type
conf
DOI
10.1109/ICNC.2012.53
Filename
6424579