Regional Information Center for Science and Technology - Mechanisms of Optimizing MapReduce Framework on High Performance Computer

DocumentCode :

688213

Title :

Mechanisms of Optimizing MapReduce Framework on High Performance Computer

Author :

Jie Yu ; Guangming Liu ; Wei Hu ; Wenrui Dong ; Weiwei Zhang

Author_Institution :

Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China

fYear :

2013

fDate :

13-15 Nov. 2013

Firstpage :

708

Lastpage :

713

Abstract :

With data volumes growing constantly and exponentially, the industry faces an unprecedented challenge: processing tremendous amounts of data efficiently and reliably. High performance computers (HPC) have played a major role in big data processing thanks to their substantial computational power and very large storage capacity. However, HPC is difficult to utilize efficiently because of its relatively low availability and usability. We propose implementing the MapReduce framework on HPC to address these problems and to broaden the range of applications for HPC. We design a workable plan for deploying Hadoop on HPC with a Lustre file system, and tune Lustre for better performance based on the data access patterns of Hadoop. A virtual memory disk is proposed to efficiently buffer temporary data and store intermediate data. By taking advantage of the high-speed interconnect of HPC, intermediate data can be transferred efficiently from map tasks to reduce tasks; this cannot be achieved in a Hadoop system on a server cluster, where the data flow rate is bounded by the bandwidth of a low-speed network such as Ethernet. An evaluation using the standard benchmarks provided in the Hadoop package shows that, after applying the proposed optimizations, the Hadoop system on HPC outperforms the Hadoop system on a server cluster, especially when handling data-intensive applications.

Keywords :

Big Data; file servers; parallel processing; Ethernet; HPC; Hadoop package; Hadoop system; Lustre file system; MapReduce framework optimization mechanism; big data processing; buffer temporary data; data access; high performance computer; high-speed interconnect system; low-speed network; server cluster; super-large storage; virtual memory disk; Bandwidth; Benchmark testing; Buffer storage; Computers; File systems; Reliability; Servers;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on

Conference_Location :

Zhangjiajie

Type :

conf

DOI :

10.1109/HPCC.and.EUC.2013.104

Filename :

6831986

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=688213