• DocumentCode
    3717356
  • Title
    A Hadoop-based visualization and diagnosis framework for earth science data
  • Author
    Shujia Zhou;Xi Yang;Xiaowen Li;Toshihisa Matsui;Si Liu;Xian-He Sun;Weikuo Tao
  • Author_Institution
    Northrop Grumman Information Technology, McLean, VA 22102
  • fYear
    2015
  • Firstpage
    1972
  • Lastpage
    1977
  • Abstract
    With rapidly growing computing power, ultra-high-resolution Earth science simulations over long time periods are feasible. However, it is still very challenging to distribute and analyze the huge volume of simulation results, which can exceed 100 TB. One key reason is that typical Earth science data are represented in NetCDF, which is not supported by the popular and powerful Hadoop Distributed File System (HDFS) and consequently cannot be analyzed with tools based on HDFS. In this paper, we propose a Hadoop-based visualization and diagnosis framework for visualizing and analyzing Earth science data. It has a data model that transforms data from NetCDF to CSV (comma-separated values), a format supported by HDFS. With this model, data can be processed with operations such as maximum, sum, and subset through Hive and Cloudera Impala, so typical diagnoses can be performed. In addition, the framework has a technique to visualize and diagnose HDFS-resident data with the popular visualization and diagnosis tool IDL. To speed up this process, a concurrent reader is developed to obtain HDFS-resident data. Moreover, a dynamic reader that transfers data from a parallel file system (PFS) to HDFS is developed to efficiently visualize and diagnose PFS-resident data. Cloud-resolving model simulations are used for testing and evaluating this framework.
  • Keywords
    "Data visualization","Data models","Clouds","Geoscience","Computational modeling","Customer relationship management","Instruction sets"
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (Big Data), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/BigData.2015.7363977
  • Filename
    7363977