• DocumentCode
    2430260
  • Title
    Building a large scale climate data system in support of HPC environment

  • Author
    Wang, Feiyi ; Harney, John ; Shipman, Galen ; Williams, Dean ; Cinquini, Luca

  • Author_Institution
    Oak Ridge Nat. Lab., Oak Ridge, TN, USA
  • fYear
    2011
  • fDate
    19-21 Oct. 2011
  • Firstpage
    380
  • Lastpage
    385
  • Abstract
    The Earth System Grid Federation (ESG) is a large-scale, multi-institutional, interdisciplinary project that aims to provide climate scientists and impact policy makers worldwide with a web-based and client-based platform to publish, disseminate, compare and analyze ever-increasing climate-related data. This paper describes our practical experience with the design, development and operation of such a system. In particular, we focus on support for the data lifecycle from a high-performance computing (HPC) perspective, which is critical to the end-to-end scientific discovery process. We discuss three subjects that interconnect the consumers and producers of scientific datasets: (1) the motivations, complexities and solutions of deep storage access and sharing in a tightly controlled environment; (2) the importance of scalable and flexible data publication/population; and (3) high-performance indexing and search of data with geospatial properties. These perceived corner issues collectively contributed to the overall user experience and proved to be as important as any other architectural design consideration. Although the requirements and challenges are rooted in, and discussed from, a climate science domain context, we believe the architectural problems, ideas and solutions discussed in this paper are generally useful and applicable in a larger scope.
  • Keywords
    Internet; climatology; geophysics computing; indexing; information dissemination; information retrieval; storage management; ESG; HPC environment; Web-based platform; architectural design considerations; architectural problems; client-based platform; climate related data; climate science domain context; climate scientists; data lifecycle; data population; deep storage access; earth system grid federation; end-to-end scientific discovery process; flexible data publication; geospatial property; high performance computing; high performance indexing; large scale climate data system; multiinstitutional interdisciplinary project; policy makers; scalable data publication; scientific datasets; Access control; Catalogs; Context; Data models; Distributed databases; Logic gates; Meteorology
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Next Generation Web Services Practices (NWeSP), 2011 7th International Conference on
  • Conference_Location
    Salamanca
  • Print_ISBN
    978-1-4577-1125-1
  • Type
    conf
  • DOI
    10.1109/NWeSP.2011.6088209
  • Filename
    6088209