DocumentCode :
688213
Title :
Mechanisms of Optimizing MapReduce Framework on High Performance Computer
Author :
Jie Yu ; Guangming Liu ; Wei Hu ; Wenrui Dong ; Weiwei Zhang
Author_Institution :
Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China
fYear :
2013
fDate :
13-15 Nov. 2013
Firstpage :
708
Lastpage :
713
Abstract :
With the amount of data growing constantly and exponentially, industry faces an unprecedented challenge in processing tremendous amounts of data efficiently and reliably. High performance computers (HPC) have played a major role in big data processing thanks to their substantial computational power and very large storage. However, some drawbacks remain in utilizing HPC efficiently, owing to its relatively low availability and usability. We propose implementing the MapReduce framework on HPC to solve these problems and to broaden the application field of HPC. We design a workable plan to deploy Hadoop on HPC with a Lustre file system, and tune Lustre for better performance based on the nature of data access in Hadoop. A virtual memory disk is proposed to efficiently buffer temporary data and store intermediate data. By taking advantage of the high-speed interconnect of HPC, the intermediate data can be transferred efficiently from map tasks to reduce tasks, which cannot be achieved in a Hadoop system on a server cluster, where the rate of data flow is bounded by the bandwidth of a low-speed network such as Ethernet. An evaluation driven by the standard benchmarks provided in the Hadoop package shows that, after applying the proposed optimization methods, the Hadoop system on HPC performs better than the Hadoop system on a server cluster, especially when handling data-intensive applications.
Keywords :
Big Data; file servers; parallel processing; Ethernet; HPC; Hadoop package; Hadoop system; Lustre file system; MapReduce framework optimization mechanism; big data processing; buffer temporary data; data access; high performance computer; high-speed interconnect system; low-speed network; server cluster; super-large storage; virtual memory disk; Bandwidth; Benchmark testing; Buffer storage; Computers; File systems; Reliability; Servers;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2013 IEEE 10th International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC)
Conference_Location :
Zhangjiajie
Type :
conf
DOI :
10.1109/HPCC.and.EUC.2013.104
Filename :
6831986