Title :
Scheduling Data on Data-Driven Master/Worker Platform
Author :
Labidi, Mohamed ; Bing Tang ; Fedak, Gilles ; Khemakem, Maher ; Jemni, Mohamed
Author_Institution :
LaTICE Lab., Univ. of Tunis, Tunis, Tunisia
Abstract :
With data-intensive applications, resorting to distributed storage can improve scalability and avoid data-intensive problems. Storing data permanently on computing nodes is an attractive approach, especially when the data is large and frequently used. Moreover, processing large data is a compute-intensive task, which encourages parallel execution. Nevertheless, data placement on computing nodes should be optimized to achieve load balancing. In this work, we investigate scheduling heuristics for optimizing data distribution across the computing nodes. Motivated by its fine control over the common operations associated with data management, we use BitDew, a desktop grid middleware designed for large-scale data management. With BitDew, we build a Data-Driven Master/Worker Platform to distribute Magick, an OCR application based on the Dynamic Time Warping (DTW) algorithm. We evaluate the benefit of implementing the studied scheduling heuristics to achieve load balancing in both homogeneous and heterogeneous environments. We present experimental results that demonstrate the efficiency of our approach.
Keywords :
grid computing; middleware; optical character recognition; parallel processing; processor scheduling; resource allocation; BitDew; DTW algorithm; Magick; OCR application; computing intensive task; computing nodes; data distribution; data intensive applications; data placement; data scheduling; data-driven master-worker platform; data-intensive problems; desktop grid middleware; distributed storage; dynamic time warping algorithm; heterogeneous environment; homogeneous environment; large scale data management; load balancing; parallel execution; Image recognition; Load management; Optical character recognition software; Processor scheduling; Scheduling; Servers; Time factors; BitDew; Data scheduling; Large Scale OCR; Scheduling heuristics;
Conference_Title :
Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2012 13th International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-0-7695-4879-1
DOI :
10.1109/PDCAT.2012.122