• DocumentCode
    3120720
  • Title

    Zest Checkpoint storage system for large supercomputers

  • Author

    Nowoczynski, Paul ; Stone, Nathan ; Yanovich, Jared ; Sommerfield, Jason

  • Author_Institution
    Pittsburgh Supercomput. Center, Pittsburgh, PA
  • fYear
    2008
  • fDate
    17-17 Nov. 2008
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    The PSC has developed a prototype distributed file system infrastructure that vastly accelerates aggregated write bandwidth on large compute platforms. Write bandwidth, more than read bandwidth, is the dominant bottleneck in HPC I/O scenarios due to writing checkpoint data, visualization data and post-processing (multi-stage) data. We have prototyped a scalable solution that will be directly applicable to future petascale compute platforms with on the order of 10^6 cores. Our design emphasizes high-efficiency scalability, low-cost commodity components, lightweight software layers, end-to-end parallelism, client-side caching and software parity, and a unique model of load-balancing outgoing I/O onto high-speed intermediate storage followed by asynchronous reconstruction to a third-party parallel file system.
  • Keywords
    checkpointing; data visualisation; input-output programs; mainframes; parallel processing; program verification; resource allocation; HPC I-O scenarios; asynchronous reconstruction; checkpoint storage system; client-side caching; data checkpoint; data visualization; end-to-end parallelism; high-speed intermediate storage; load-balancing; parallel file system; petascale compute platforms; post-processing data; prototype distributed file system infrastructure; software layers; software parity; Acceleration; Bandwidth; Data visualization; Distributed computing; File systems; Petascale computing; Prototypes; Software prototyping; Supercomputers; Writing; Client-side RAID; High-performance commodity storage; Parallel Application Checkpoint; Parallel I/O; Petascale Storage; Terabytes per second; log-structured filesystems;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Petascale Data Storage Workshop, 2008. PDSW '08. 3rd
  • Conference_Location
    Austin, TX
  • Print_ISBN
    978-1-4244-4208-9
  • Type

    conf

  • DOI
    10.1109/PDSW.2008.4811883
  • Filename
    4811883