• DocumentCode
    3414561
  • Title

    Architecture and performance of the Hitachi SR2201 massively parallel processor system

  • Author

    Fujii, Hiroaki ; Yasuda, Yoshiko ; Akashi, Hideya ; Inagami, Yasuhiro ; Koga, Makoto ; Ishihara, Osamu ; Kashiyama, Masamori ; Wada, Hideo ; Sumimoto, Tsutomu

  • Author_Institution
    Central Res. Lab., Hitachi Ltd., Kokubunji, Japan
  • fYear
    1997
  • fDate
    1-5 Apr 1997
  • Firstpage
    233
  • Lastpage
    241
  • Abstract
    RISC-based Massively Parallel Processors (MPPs) often show low efficiency in real-world applications because of cache miss penalty, insufficient throughput of the memory system, and poor inter-processor communication performance. Hitachi's SR2201, an MPP scalable up to 2048 processors and 600 GFLOPS peak performance, overcomes these problems by introducing three novel features. First, its processor, the 150 MHz HARP-IE, solves the cache miss penalty by “pseudo vector processing” (PVP). In PVP, data is loaded by prefetching to a special register bank, bypassing the cache. Second, a multi-bank memory architecture that operates like a pipeline eliminates the memory system bottleneck. Third, the inter-processor communication achieves high performance on the three-dimensional crossbar network, using a “remote DMA transfer” protocol and hardware-based cache coherency. As a result of these improvements, the SR2201 achieved 220.4 GFLOPS with 1024 processors in the LINPACK benchmark, which is almost 72% of the peak performance.
  • Keywords
    parallel architectures; parallel processing; performance evaluation; reduced instruction set computing; 150 MHz HARP-IE; Hitachi SR2201 massively parallel processor system; LINPACK benchmark; RISC-based processors; cache miss penalty; hardware-based cache coherency; inter-processor communication performance; memory system bottleneck; multi-bank memory architecture; protocol; Application software; Cache memory; Computer architecture; Concurrent computing; Degradation; Laboratories; Prefetching; Reduced instruction set computing; Software performance; Throughput;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing Symposium, 1997. Proceedings., 11th International
  • Conference_Location
    Geneva
  • ISSN
    1063-7133
  • Print_ISBN
    0-8186-7793-7
  • Type

    conf

  • DOI
    10.1109/IPPS.1997.580901
  • Filename
    580901