Input/output APIs and data organization for high performance scientific computing

Author

Lofstead, Jay ; Zheng, Fang ; Klasky, Scott ; Schwan, Karsten

Author_Institution

Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA

fYear

2008

fDate

17-17 Nov. 2008

Abstract

Scientific Data Management has become essential to the productivity of scientists using ever larger machines and running applications that produce ever more data. There are several specific issues when running on petascale (and beyond) machines. One is the need for massively parallel data output, which, in part, depends on the data formats and semantics being used. Here, the inhibition of parallelism by file system notions of strict and immediate consistency can be addressed with "delayed data consistency" methods. Such methods can also be used to remove the runtime coordination steps required for immediate consistency from machine resources like BlueGene's separate networks for barrier calls and its dedicated IO nodes, thereby freeing them to instead perform alternate tasks that enhance data output performance and/or richness. Second, once data is generated, it is important to be able to access it efficiently, which implies the need for rapid data characterization and indexing. This can be achieved by adding small amounts of metadata to the output process, thereby permitting scientists to quickly make informed decisions about which files to process from large-scale science runs. Third, failure probabilities increase with an increasing number of nodes, which suggests the need for organizing output data to be resilient to failures in which the output from a single node or from a small number of nodes is lost or corrupted. This paper demonstrates the utility of using delayed consistency methods for the process of data output from the compute nodes of petascale machines. It also demonstrates the advantages derived from resilient data organization coupled with lightweight methods for data indexing. An implementation of these techniques is realized in ADIOS, the Adaptable IO System, and its BP intermediate file format. The implementation is designed to be compatible with existing, well-known file formats like HDF-5 and NetCDF, thereby permitting end users to exploit the rich tool chains for these formats. Initial performance evaluations of the approach exhibit substantial performance advantages over using native parallel HDF-5 in the Chimera supernova code.
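The idea of lightweight data characterization mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each writer records only a count, a minimum, and a maximum for its data block, and that a reader consults those summaries before opening any data. All names here (BlockSummary, characterize, build_index, blocks_overlapping) are hypothetical and are not taken from ADIOS or the BP format.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class BlockSummary:
    """Small characterization record kept alongside one writer's data block.

    Hypothetical structure for illustration; not the BP file layout.
    """
    writer_rank: int
    count: int
    minimum: float
    maximum: float


def characterize(writer_rank: int, values: Sequence[float]) -> BlockSummary:
    """Compute the summary a writer could emit together with its output."""
    return BlockSummary(writer_rank, len(values), min(values), max(values))


def build_index(blocks: Dict[int, Sequence[float]]) -> List[BlockSummary]:
    """Collect one summary per writer, ordered by rank.

    Each summary depends only on that writer's own data, so no
    coordination between writers is needed to produce it.
    """
    return [characterize(rank, vals) for rank, vals in sorted(blocks.items())]


def blocks_overlapping(index: List[BlockSummary], low: float, high: float) -> List[int]:
    """Return ranks whose [minimum, maximum] range intersects [low, high].

    Only the summaries are inspected; the data blocks are never read.
    """
    return [s.writer_rank for s in index
            if s.maximum >= low and s.minimum <= high]


if __name__ == "__main__":
    # Three writers, each with a small local array (made-up values).
    blocks = {0: [1.0, 2.0], 1: [5.0, 9.0], 2: [3.0, 4.0]}
    index = build_index(blocks)
    print(blocks_overlapping(index, 4.5, 6.0))
```

Because every summary is computed locally, losing one writer's block or summary leaves the remaining entries usable, which is in the spirit of the resilient organization the abstract describes.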

Keywords

application program interfaces; database indexing; input-output programs; meta data; natural sciences computing; parallel programming; probability; system recovery; data indexing; data organization; failure probability; file system; high performance scientific computing; input-output API; petascale machine; scientific data management; Character generation; File systems; Indexing; Large-scale systems; Organizing; Parallel processing; Petascale computing; Productivity; Runtime; Scientific computing

fLanguage

English

Publisher

IEEE

Conference_Title

Petascale Data Storage Workshop, 2008. PDSW '08. 3rd

Conference_Location

Austin, TX

Print_ISBN

978-1-4244-4208-9

Type

conf

DOI

10.1109/PDSW.2008.4811881

Filename

4811881

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3120687