Title :
Statistical Performance Comparisons of Computers
Author :
Tianshi Chen ; Qi Guo ; Olivier Temam ; Yue Wu ; Yungang Bao ; Zhiwei Xu ; Yunji Chen
Author_Institution :
State Key Laboratory of Computer Architecture, Institute of Computing Technology, Beijing, China
Abstract :
As a fundamental task in computer architecture research, performance comparison has been continuously hampered by the variability of computer performance. In traditional performance comparisons, the impact of performance variability is usually ignored (i.e., the means of performance observations are compared regardless of their variability), or is in a few cases addressed directly with t-statistics without checking the number and normality of the performance observations. In this paper, we formulate performance comparison as a statistical task, and empirically illustrate why and how common practices can lead to incorrect comparisons. We propose a non-parametric hierarchical performance testing (HPT) framework for performance comparison, which is significantly more practical than standard t-statistics because it does not require collecting a large number of performance observations to achieve a normal distribution of the sample mean. In particular, the proposed HPT facilitates quantitative performance comparison, in which the performance speedup of one computer over another is statistically evaluated. Compared with the HPT, a common practice that uses geometric mean performance scores to estimate performance speedup has errors of 8.0 to 56.3 percent on SPEC CPU2006 or SPEC MPI2007, which demonstrates the necessity of using appropriate statistical techniques. The HPT framework has been implemented as open-source software and integrated into the PARSEC 3.0 benchmark suite.
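Illustration :
The HPT framework described above is built on non-parametric rank-based tests. As a rough sketch of the idea only (not the authors' implementation, whose exact procedure and scoring are defined in the paper), the Python snippet below runs a Wilcoxon rank-sum test on each benchmark's repeated observations, then a Wilcoxon signed-rank test across benchmarks; the function name, the tie-handling rule, and the significance threshold are illustrative assumptions.

    # Minimal sketch of a hierarchical non-parametric comparison in the
    # spirit of HPT (an assumption, not the authors' implementation):
    # level 1 applies a Wilcoxon rank-sum test to each benchmark's
    # repeated observations; level 2 applies a Wilcoxon signed-rank test
    # to the per-benchmark outcomes to judge the whole suite.
    import numpy as np
    from scipy.stats import ranksums, wilcoxon

    def hierarchical_compare(perf_a, perf_b, alpha=0.05):
        """perf_a, perf_b: dicts mapping benchmark name -> array of
        repeated performance scores (higher is better) on A and B."""
        diffs = []
        for bench in perf_a:
            a = np.asarray(perf_a[bench])
            b = np.asarray(perf_b[bench])
            # Level 1: rank-sum test avoids any normality assumption
            # on the per-benchmark observations.
            _, p = ranksums(a, b)
            # Benchmarks with no significant difference count as ties.
            diffs.append(np.median(a) - np.median(b) if p < alpha else 0.0)
        if not any(diffs):
            return None  # no benchmark separates the two computers
        # Level 2: signed-rank test asks whether A's wins outweigh B's
        # across the benchmark suite.
        return wilcoxon(diffs)

    # Illustrative usage with synthetic lognormal performance scores:
    rng = np.random.default_rng(0)
    perf_a = {f"b{i}": rng.lognormal(1.0, 0.2, 10) for i in range(12)}
    perf_b = {f"b{i}": rng.lognormal(0.9, 0.2, 10) for i in range(12)}
    print(hierarchical_compare(perf_a, perf_b))

By contrast, the common practice criticized in the abstract reduces each benchmark to one number and reports a single geometric mean of per-benchmark speedups, discarding the variability information that the hierarchical test exploits.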
Keywords :
computer architecture; parallel processing; public domain software; statistical analysis; HPT framework; PARSEC 3.0 benchmark suite; computer architecture research; computer performance; geometric mean performance scores; nonparametric hierarchical performance testing; normal distribution; open source software; performance observations; performance variability; quantitative performance comparison; standard t-statistics; statistical performance comparisons; statistical task; statistical techniques; Benchmark testing; Computer architecture; Computer performance; Computers; Probability distribution; Reliability; t-statistics; Performance comparison; hierarchical performance testing; performance distribution
Journal_Title :
IEEE Transactions on Computers
DOI :
10.1109/TC.2014.2315614