Title :
Performance Characterization of Hadoop and DataMPI Based on Amdahl's Second Law
Author :
Fan Liang ; Chen Feng ; Xiaoyi Lu ; Zhiwei Xu
Author_Institution :
Inst. of Comput. Technol., Beijing, China
Abstract :
Amdahl's second law has served for decades as a useful guideline for designing and evaluating balanced computer systems. The law has mainly been applied to hardware systems and peak capacities. This paper applies Amdahl's second law from a new angle: evaluating how the application framework software, a key component of big data systems, influences system performance and balance. We compare two big data application frameworks, Apache Hadoop and DataMPI, using three representative application benchmarks and various data sizes. System monitors and hardware performance counters are used to record resource utilization, instruction execution characteristics, memory accesses, and I/O rates. These measurements are used to derive the three runtime metrics of Amdahl's second law: CPU speed (GIPS), memory capacity (GB), and I/O rate (Gbps). The experimental results show that a DataMPI-based big data system performs better and is more balanced than a Hadoop-based system.
Keywords :
Big Data; distributed processing; input-output programs; message passing; performance evaluation; storage management; Amdahl second law; Apache Hadoop; CPU speed; DataMPI; IO rates; big data application framework software systems; big data systems; hardware performance counters; hardware systems; instructions execution; memory accesses; memory capacity; peak capacities; performance characterization; resource utilization; system monitors; Bandwidth; Benchmark testing; Big data; Hardware; Measurement; Memory management; Amdahl's second law; Big data; DataMPI; Hadoop;
Conference_Title :
Networking, Architecture, and Storage (NAS), 2014 9th IEEE International Conference on
Conference_Location :
Tianjin
DOI :
10.1109/NAS.2014.39