DocumentCode :
2555097
Title :
A novel parallel hybrid PSO-GA using MapReduce to schedule jobs in Hadoop data grids
Author :
Sadasivam, Sudha G. ; Selvaraj, Dharini
Author_Institution :
Dept. of CSE, PSG Coll. of Technol., Coimbatore, India
fYear :
2010
fDate :
15-17 Dec. 2010
Firstpage :
377
Lastpage :
382
Abstract :
Scheduling heterogeneous tasks in a heterogeneous grid environment aims to utilize resources effectively and to share the load among the available resources. Such a task assignment problem is NP-hard. This paper presents a Hybrid Particle Swarm Optimization - Genetic Algorithm (HPSO-GA) for solving the Task Assignment Problem. The novel Particle Swarm Optimization (PSO) incorporates GA operations such as crossover and mutation to improve resource utilization and complete tasks within their deadlines. The algorithm aims to distribute load among the heterogeneous resources in the grid environment according to their capacity. Analyzing data- and computation-intensive applications such as web log processing and bioinformatics to achieve optimal performance is time consuming. Hence, parallelization of the optimization function is essential. Large-scale parallelization of the optimization function must also guarantee efficient communication, load balancing, fault tolerance and reliability. This paper presents an HPSO-GA based on the MapReduce parallel programming model. The HPSO-GA yields better results than standard PSO and provides better load balancing and resource utilization in a grid environment. It identifies the exact node to which a task can be assigned in a Hadoop cluster. Hence, the proposed approach can be used in the resource management system of Hadoop, together with Hadoop and system parameters, to schedule jobs efficiently in a Hadoop cluster.
Keywords :
data analysis; fault tolerant computing; genetic algorithms; grid computing; particle swarm optimisation; pattern clustering; processor scheduling; resource allocation; task analysis; Hadoop cluster; Hadoop data grid; MapReduce; NP-hard problem; data analysis; fault tolerance; genetic algorithm; heterogeneous grid environment; hybrid particle swarm optimization; job scheduling; large-scale parallelisation; load balancing; load distribution; optimization function; reliability; resource management system; resource utilization; task assignment problem; Gallium; Program processors; HPSO-GA; Hadoop; MapReduce; cluster performance; scheduler;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Nature and Biologically Inspired Computing (NaBIC), 2010 Second World Congress on
Conference_Location :
Fukuoka
Print_ISBN :
978-1-4244-7377-9
Type :
conf
DOI :
10.1109/NABIC.2010.5716346
Filename :
5716346