DocumentCode
2555097
Title
A novel parallel hybrid PSO-GA using MapReduce to schedule jobs in Hadoop data grids
Author
Sadasivam, Sudha G. ; Selvaraj, Dharini
Author_Institution
Dept. of CSE, PSG Coll. of Technol., Coimbatore, India
fYear
2010
fDate
15-17 Dec. 2010
Firstpage
377
Lastpage
382
Abstract
Scheduling heterogeneous tasks in a heterogeneous grid environment aims at utilizing the resources effectively and sharing the load among them. Such a task assignment problem is NP-hard. This paper presents a Hybrid Particle Swarm Optimization - Genetic Algorithm (HPSO-GA) for solving the Task Assignment Problem. The proposed Particle Swarm Optimization (PSO) variant incorporates GA operations such as crossover and mutation to improve resource utilization and complete tasks within their deadlines. The algorithm aims at distributing load among the heterogeneous resources in the grid environment based on their capacity. Analyzing data- and computation-intensive applications, such as web log processing and bioinformatics, to achieve optimal performance is time-consuming; hence, parallelizing the optimization function is essential. Large-scale parallelization of the optimization function must also guarantee efficient communication, load balancing, fault tolerance and reliability. This paper presents an HPSO-GA built on the MapReduce parallel programming model. The HPSO-GA yields better results than standard PSO and provides better load balancing and resource utilization in a grid environment. It identifies the exact node to which a task can be assigned in a Hadoop cluster. Hence, the proposed approach can be used in the resource management system of Hadoop, together with Hadoop and system parameters, to schedule jobs efficiently in a Hadoop cluster.
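The abstract does not give the operators or the fitness function, so the following is only a minimal illustrative sketch of one common way to combine PSO with GA operators for task assignment, not the authors' implementation. Everything in it is an assumption: the function names (`makespan`, `crossover`, `mutate`, `hpso_ga`), the makespan fitness, single-point crossover with the personal and global bests in place of a velocity update, and all parameter values. It runs serially, requires at least two tasks, and omits the MapReduce layer and deadlines.

```python
import random


def makespan(assignment, task_lengths, node_capacities):
    """Finish time of the most heavily loaded node (lower is better)."""
    load = [0.0] * len(node_capacities)
    for task, node in enumerate(assignment):
        # Assumed cost model: run time = task length / node capacity.
        load[node] += task_lengths[task] / node_capacities[node]
    return max(load)


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: prefix of parent_a, suffix of parent_b."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(assignment, num_nodes, rate, rng):
    """Reassign each task to a random node with probability `rate`."""
    return [rng.randrange(num_nodes) if rng.random() < rate else node
            for node in assignment]


def hpso_ga(task_lengths, node_capacities, swarm_size=30, iterations=200,
            mutation_rate=0.05, seed=0):
    """Hypothetical hybrid loop: particles move by crossover and mutation."""
    rng = random.Random(seed)
    num_tasks, num_nodes = len(task_lengths), len(node_capacities)

    def fitness(assignment):
        return makespan(assignment, task_lengths, node_capacities)

    # A particle is a list mapping task index -> node index.
    swarm = [[rng.randrange(num_nodes) for _ in range(num_tasks)]
             for _ in range(swarm_size)]
    personal_best = [particle[:] for particle in swarm]
    global_best = min(personal_best, key=fitness)[:]

    for _ in range(iterations):
        for i, particle in enumerate(swarm):
            # Crossover with the bests stands in for the PSO velocity update.
            candidate = crossover(particle, personal_best[i], rng)
            candidate = crossover(candidate, global_best, rng)
            candidate = mutate(candidate, num_nodes, mutation_rate, rng)
            swarm[i] = candidate
            if fitness(candidate) < fitness(personal_best[i]):
                personal_best[i] = candidate[:]
                if fitness(candidate) < fitness(global_best):
                    global_best = candidate[:]
    return global_best, fitness(global_best)


if __name__ == "__main__":
    tasks = [8, 3, 5, 9, 2, 7, 4, 6]   # made-up task lengths
    nodes = [1.0, 2.0, 4.0]            # made-up node capacities
    best, cost = hpso_ga(tasks, nodes)
    print(best, cost)
```

In this sketch each particle's fitness is evaluated independently of the others, which is the property that makes such a loop a plausible candidate for a map phase; how the paper actually splits the work between map and reduce is not stated in this record.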
Keywords
data analysis; fault tolerant computing; genetic algorithms; grid computing; particle swarm optimisation; pattern clustering; processor scheduling; resource allocation; task analysis; Hadoop cluster; Hadoop data grid; MapReduce; NP-hard problem; fault tolerance; genetic algorithm; heterogeneous grid environment; hybrid particle swarm optimization; job scheduling; large-scale parallelisation; load balancing; load distribution; optimization function; reliability; resource management system; resource utilization; task assignment problem; Gallium; Program processors; HPSO-GA; Hadoop; cluster performance; scheduler
fLanguage
English
Publisher
IEEE
Conference_Titel
Nature and Biologically Inspired Computing (NaBIC), 2010 Second World Congress on
Conference_Location
Fukuoka
Print_ISBN
978-1-4244-7377-9
Type
conf
DOI
10.1109/NABIC.2010.5716346
Filename
5716346