DocumentCode :
1054482
Title :
A Benchmark Comparison of Three Supercomputers: Fujitsu VP-200, Hitachi S810/20, and Cray X-MP/2
Author :
Lubeck, Olaf ; Moore, James ; Mendez, Raul
Author_Institution :
Computing and Communications Division, Los Alamos National Laboratory
Volume :
18
Issue :
12
fYear :
1985
Firstpage :
10
Lastpage :
24
Abstract :
The authors report on benchmarks of three machines: the Fujitsu VP-200 (benchmarked at the Fujitsu plant in Numazu, Japan, in May 1984 and more recently in May 1985 at Amdahl); the Hitachi S810/20 (made available at the Large Scale Computation Center of Tokyo University in May 1984); and the Cray X-MP (Cray Research provided time on an X-MP running COS and version 1.13 of their CFT compiler). Other X-MP benchmarks were conducted at Los Alamos. For the most part, the codes executed on the machines came from the Los Alamos benchmark set, which is composed of programs that typify the Los Alamos workload. It was found that the VP-200 can be two to three times as fast as the X-MP/2 in vector mode for large vector lengths. Results from highly vectorized codes (rudimentary matrix operations and a linear equations solver) as well as the timings from basic vector operations support this conclusion. On the codes that are more indicative of the Los Alamos workload, the VP-200 and X-MP/2 are comparable. The times for BMK5 (0% vectorized) were virtually equivalent; BMK21 (0% vectorized) was executed on the VP-200 18% faster than on the X-MP; BMK11b (62% vectorized) and BMK21a (18% vectorized) favored the VP-200 by 17% and 24%, respectively; and SIMPLE (93% vectorized) favored the X-MP by 25%. The authors suggest three reasons for the parity: (1) from the standpoint of Amdahl's Law, many of the codes are dominated by the equivalent scalar performance of the machines; (2) even in the cases of higher degrees of vectorization, the machines perform equivalently for vector lengths less than 100; and (3) function calls prevented full vectorization on the VP-200 in some cases (for example, BMK11a). The Hitachi S810/20 does not perform as well as the other two on the benchmark codes, probably because its scalar performance and vector processor clock period are slower. Additionally, the benchmark codes could not make use of the large number of functional units in the S810/20.
Keywords :
Analytical models; Benchmark testing; Large-scale systems; Performance gain; Supercomputers; Time measurement; Vector processors;
fLanguage :
English
Journal_Title :
Computer
Publisher :
IEEE
ISSN :
0018-9162
Type :
jour
DOI :
10.1109/MC.1985.1662769
Filename :
1662769