Abstract:
Summary form only given. Scalability is one of the major challenges for advanced HPC systems in the post-petascale and exascale era, so innovative integrated technology designs are needed for new architectures and their associated software stacks. We need to explore the capabilities of the CPU, accelerator, interconnection, and I/O storage system, up to the whole system. This talk will discuss scalability-centric HPC system hardware and software design as it relates to computation, communication, data processing, and fault tolerance. Experiences from the design and implementation of the Tianhe systems will also be shared. Furthermore, some investigations into architecture and software design for next-generation HPC systems will be presented. In general, a co-design approach should be followed throughout research and development to deliver a whole system for scalable computing that efficiently supports large-scale domain applications.
Keywords:
parallel processing; software architecture; software fault tolerance; CPU; I-O storage system; Tianhe systems; accelerator; architecture design; data procession; exascale era; fault tolerance; hardware design; innovative integrated technology designs; interconnection; large-scale domain applications; post-petascale era; scalability-centric HPC system design; scalable computing; software design; software stacks;