DocumentCode :
2322752
Title :
Towards Deploying Elastic Hadoop in the Cloud
Author :
Mao, Hong ; Zhang, Zhenzhong ; Zhao, Bin ; Xiao, Limin ; Ruan, Li
Author_Institution :
State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing, China
fYear :
2011
fDate :
10-12 Oct. 2011
Firstpage :
476
Lastpage :
482
Abstract :
The rapid growth of Internet applications is driving the development of cloud computing, a new paradigm for provisioning computing infrastructure and services over the network. In cloud computing environments, MapReduce is often used for scientific computing, such as matrix multiplication, and for data mining and information extraction on massive data sets. Hadoop, an open-source implementation of MapReduce, is well suited to processing these kinds of applications in parallel. However, current Hadoop environments are mostly deployed manually on physical servers and lack flexibility. This paper proposes the EHAD (Elastic Hadoop Auto-Deployer) system, which creates or destroys the corresponding number of VM nodes and deploys or releases a Hadoop environment across those nodes for client users at the service level. We also propose multithreading and VMOP (Virtual Machine Optimized Placement) to improve the service quality of EHAD. Experiments show that EHAD can deploy a Hadoop cluster on demand in less than 300 seconds. The multithreaded method shortens the time needed to create 28 VMs by a factor of three, and the VMOP policy improves the runtime performance of the Hadoop cluster by 9.73 percent.
Keywords :
cloud computing; data mining; matrix multiplication; multi-threading; public domain software; scientific information systems; virtual machines; EHAD system; Internet application; MapReduce; VM node; VMOP policy; cloud computing; data mining; elastic Hadoop autodeployer system; information extraction; matrix multiplication; multithread method; network service; open-source implementation; physical server; scientific computing; service quality; virtual machine optimized placement; Cloud computing; Databases; Instruction sets; Multithreading; Servers; Time factors; Virtual machining; Cloud Environment; Deployment; Elastic; Hadoop; MapReduce;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2011 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-1827-4
Type :
conf
DOI :
10.1109/CyberC.2011.83
Filename :
6079430