DocumentCode :
611047
Title :
Bi-Hadoop: Extending Hadoop to Improve Support for Binary-Input Applications
Author :
Xiao Yu ; Bo Hong
fYear :
2013
fDate :
13-16 May 2013
Firstpage :
245
Lastpage :
252
Abstract :
The MapReduce programming model, along with its open-source implementation, Hadoop, has provided a cost-effective solution for many data-intensive applications. Hadoop stores data in a distributed fashion and exploits data locality by assigning tasks to the nodes where their data is stored. Many data-intensive applications, however, require two (or more) inputs for each of their tasks. Such applications pose significant challenges for Hadoop because the inputs to one task often reside on multiple nodes, and Hadoop is unable to discover data locality in this scenario. This often leads to excessive data transfers and significant degradation in application performance. In this paper, we present Bi-Hadoop, an efficient extension of Hadoop to better support binary-input applications. Bi-Hadoop integrates an easy-to-use user interface, a binary-input aware task scheduler, and a caching subsystem. Extensive experiments show that Bi-Hadoop can significantly improve the execution of binary-input applications by reducing the data transfer overhead, and that it outperforms existing Hadoop by up to 3.3x.
Keywords :
cache storage; data handling; public domain software; scheduling; user interfaces; Bi-Hadoop; MapReduce programming model; application performance degradation; binary-input application execution; binary-input aware task scheduler; caching subsystem; data locality; data storage; data transfer overhead reduction; data-intensive application; open-source implementation; task assignment; user interface; Data transfer; Dispatching; Scheduling algorithms; Sparse matrices; User interfaces; Vectors; Data Locality; Hadoop; MapReduce;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on
Conference_Location :
Delft
Print_ISBN :
978-1-4673-6465-2
Type :
conf
DOI :
10.1109/CCGrid.2013.56
Filename :
6546099