• DocumentCode
    2384252
  • Title

    An Efficient, Nonintrusive, Log-Based I/O Mechanism for Scientific Simulations on Clusters

  • Author

    Mitra, Soumyadeb ; Sinha, Rishi Rakesh ; Winslett, Marianne ; Jiao, Xiangmin

  • Author_Institution
    Dept. of Comput. Sci., Illinois Univ., Urbana-Champaign, IL
  • fYear
    2005
  • fDate
    Sept. 2005
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    Scientific simulations are often very I/O intensive, requiring high I/O bandwidth to store the data generated by the simulation. Traditional supercomputers have specialized I/O systems with multiple I/O nodes and specialized interconnects to handle such high I/O loads. However, with the increased availability of inexpensive clusters of workstations, more and more simulations are now run on clusters. Unfortunately, cluster supercomputers are usually not well equipped for I/O, making I/O a serious bottleneck for such applications. To address this problem, we propose log-based I/O (LBIO), an approach that can substantially increase the I/O performance of simulations on clusters by utilizing free space on the cluster's local disks to stage data on its way to remote storage. LBIO uses local disks to create a log of all I/O calls, and uses a background thread to replay the log at the rate that best utilizes the server and network resources. LBIO is implemented as an easy-to-use, non-intrusive library: a user can turn on LBIO by adding a single initialization call to the simulation code. LBIO also works with existing scientific I/O libraries like HDF, as well as collective libraries like ROMIO. Our performance studies on microbenchmarks and a real-world scientific simulation code show that LBIO can provide up to 35% improvement in I/O performance for raw I/O and over 50% for I/O through libraries like ROMIO or HDF.
  • Keywords
    parallel machines; workstation clusters; data storage; high I/O bandwidth; log-based I/O mechanism; network resources; nonintrusive library; scientific simulations; server resources; supercomputers; Aggregates; Computational modeling; Computer simulation; File systems; Libraries; Neck; Rockets; Supercomputers; Throughput; Yarn
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing, 2005. IEEE International
  • Conference_Location
    Burlington, MA
  • ISSN
    1552-5244
  • Print_ISBN
    0-7803-9486-0
  • Electronic_ISBN
    1552-5244
  • Type

    conf

  • DOI
    10.1109/CLUSTR.2005.347041
  • Filename
    4154084
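
The mechanism summarized in the abstract (append each I/O call to a log on local disk, then let a background thread replay the log against the final destination) can be illustrated with a minimal sketch. This is not the LBIO library or its API: the class name `LogBasedWriter`, its methods, the pickle-based log format, and the `demo` helper are all hypothetical, an ordinary file stands in for remote storage, and the replay thread here drains the log as fast as it can rather than adapting its rate to server and network load as the paper describes.

```python
import os
import pickle
import queue
import tempfile
import threading


class LogBasedWriter:
    """Hypothetical illustration of log-then-replay I/O (not the LBIO API)."""

    def __init__(self, log_path):
        self._log_path = log_path
        self._log = open(log_path, "wb")      # local-disk log of write calls
        self._pending = queue.Queue()         # log offsets awaiting replay
        self._replayer = threading.Thread(target=self._replay, daemon=True)
        self._replayer.start()

    def write(self, target_path, offset, data):
        # Append the call to the local log and return without touching
        # the (slow) destination; the caller can resume computing.
        pos = self._log.tell()
        pickle.dump((target_path, offset, bytes(data)), self._log)
        self._log.flush()
        self._pending.put(pos)

    def _replay(self):
        # Background thread: read logged calls back and apply them in order.
        with open(self._log_path, "rb") as log:
            while True:
                pos = self._pending.get()
                if pos is None:               # sentinel from close()
                    break
                log.seek(pos)
                target_path, offset, data = pickle.load(log)
                mode = "r+b" if os.path.exists(target_path) else "w+b"
                with open(target_path, mode) as out:
                    out.seek(offset)
                    out.write(data)

    def close(self):
        # Wait until every logged call has been replayed.
        self._pending.put(None)
        self._replayer.join()
        self._log.close()


def demo():
    """Write two chunks through the log and return the replayed file."""
    with tempfile.TemporaryDirectory() as d:
        writer = LogBasedWriter(os.path.join(d, "io.log"))
        remote = os.path.join(d, "remote.dat")  # stand-in for remote storage
        writer.write(remote, 0, b"hello ")
        writer.write(remote, 6, b"world")
        writer.close()
        with open(remote, "rb") as f:
            return f.read()
```

In this sketch the foreground `write` costs only a local append, which is the property the abstract attributes to staging data on local disks; a fuller version would add rate control in `_replay` and reclaim log space once entries are replayed.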