DocumentCode
3023595
Title
Towards efficient supercomputing: a quest for the right metric
Author
Hsu, Chung-Hsing ; Feng, Wu-chun ; Archuleta, Jeremy S.
Author_Institution
Los Alamos Nat. Lab., NM, USA
fYear
2005
fDate
4-8 April 2005
Abstract
Over the past decade, we have been building less and less efficient supercomputers, resulting in the construction of substantially larger machine rooms and even new buildings. In addition, because of the thermal power envelope of these supercomputers, a small fortune must be spent to cool them. These infrastructure costs, coupled with the additional costs of administering and maintaining such (unreliable) supercomputers, dramatically increase their total cost of ownership. As a result, there has been substantial interest in recent years in producing more reliable and more efficient supercomputers that are easy to maintain and use. But how does one quantify efficient supercomputing? That is, what metric should be used to evaluate how efficiently a supercomputer delivers answers? We argue that existing efficiency metrics such as the performance-power ratio are insufficient and motivate the need for a new type of efficiency metric, one that incorporates notions such as reliability, availability, productivity, and total cost of ownership (TCO). In doing so, however, this paper raises more questions than it answers with respect to efficiency. In the end, we still return to the performance-power ratio as an efficiency metric with respect to power and use it to evaluate a menagerie of processor platforms in order to provide a set of reference data points for the high-performance computing community.
Keywords
computer maintenance; parallel machines; performance evaluation; power consumption; high-performance computing; performance-power ratio; supercomputer maintenance; supercomputing; thermal power envelope; Availability; Buildings; Cooling; Costs; Energy consumption; High performance computing; Laboratories; Maintenance; Productivity; Supercomputers;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE International
Print_ISBN
0-7695-2312-9
Type
conf
DOI
10.1109/IPDPS.2005.440
Filename
1420148