Title :
Smart Intermediate Data Transfer for MapReduce on Cloud Computing
Author :
Tzu-Chi Huang ; Kuo-Chih Chu ; Yu-Ruei Rao
Author_Institution :
Dept. of Electron. Eng., Lunghwa Univ. of Sci. & Technol., Taoyuan, Taiwan
Abstract :
MapReduce is a programming model proposed by Google for processing large datasets on clusters. However, MapReduce often needs to transfer a large amount of intermediate data among nodes, which degrades application performance. MapReduce can be enhanced by using the proposed Smart Intermediate Data Transfer (SIDT) in the runtime system to arrange intermediate data smartly. Although SIDT does not reduce intermediate data to the minimal size compared with other intermediate data arrangement procedures such as Huffman coding, bzip2, and gzip, the experiments in this paper show that MapReduce achieves better performance with SIDT than with the others.
Keywords :
cloud computing; distributed programming; Huffman coding; MapReduce programming model; SIDT; bzip2; gzip; intermediate data arrangement procedures; smart intermediate data transfer; Bandwidth; Data transfer; Decoding; Programming; Runtime; Intermediate Data; MapReduce
Conference_Title :
Cloud Computing and Big Data (CloudCom-Asia), 2013 International Conference on
Conference_Location :
Fuzhou
Print_ISBN :
978-1-4799-2829-3
DOI :
10.1109/CLOUDCOM-ASIA.2013.97