DocumentCode
2384252
Title
An Efficient, Nonintrusive, Log-Based I/O Mechanism for Scientific Simulations on Clusters
Author
Mitra, Soumyadeb ; Sinha, Rishi Rakesh ; Winslett, Marianne ; Jiao, Xiangmin
Author_Institution
Dept. of Comput. Sci., Illinois Univ., Urbana-Champaign, IL
fYear
2005
fDate
Sept. 2005
Firstpage
1
Lastpage
10
Abstract
Scientific simulations are often very I/O intensive, requiring high I/O bandwidth to store the data generated by the simulation. Traditional supercomputers have specialized I/O systems with multiple I/O nodes and specialized interconnects to handle such high I/O loads. However, with the increased availability of inexpensive clusters of workstations, more and more simulations are now run on clusters. Unfortunately, cluster supercomputers are usually not very well equipped for I/O, making I/O a serious bottleneck for such applications. To address this problem, we propose log-based I/O (LBIO), an approach that can substantially increase the I/O performance of simulations on clusters by utilizing free space on the cluster's local disks to stage data on its way to remote storage. LBIO uses local disks to create a log of all I/O calls, and uses a background thread to replay the log at the rate that best utilizes the server and network resources. LBIO is implemented as an easy-to-use, non-intrusive library: a user can turn on LBIO by adding a single initialization call to the simulation code. LBIO also works with existing scientific I/O libraries like HDF, as well as collective libraries like ROMIO. Our performance studies on microbenchmarks and a real-world scientific simulation code show that LBIO can provide up to 35% improvement in I/O performance for raw I/O and over 50% for I/O through libraries like ROMIO or HDF.
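The log-and-replay idea described in the abstract can be illustrated with a minimal sketch. This is not the LBIO library or its API: the class name `LogBasedWriter`, its methods, the record layout, and the use of two local files to stand in for the local disk and the remote storage are all assumptions made for illustration. The sketch also omits the rate control the abstract mentions; the background thread simply replays entries as fast as it can.

```python
import os
import queue
import struct
import tempfile
import threading


class LogBasedWriter:
    """Toy sketch (hypothetical, not LBIO): stage writes in a local log,
    then replay them against the final destination in a background thread."""

    _HEADER = struct.Struct("<QI")  # (target offset, payload length)

    def __init__(self, log_path, remote_path):
        self._log = open(log_path, "w+b")        # stands in for the local disk
        self._remote = open(remote_path, "w+b")  # stands in for remote storage
        self._pending = queue.Queue()            # log positions not yet replayed
        self._lock = threading.Lock()            # guards the shared log handle
        self._worker = threading.Thread(target=self._replay, daemon=True)
        self._worker.start()

    def write(self, offset, data):
        """Append the call to the local log and return without touching remote."""
        with self._lock:
            self._log.seek(0, os.SEEK_END)
            pos = self._log.tell()
            self._log.write(self._HEADER.pack(offset, len(data)) + data)
            self._log.flush()
        self._pending.put(pos)

    def _replay(self):
        """Background thread: re-issue each logged write against the remote file."""
        while True:
            pos = self._pending.get()
            if pos is None:  # sentinel queued by close()
                break
            with self._lock:
                self._log.seek(pos)
                header = self._log.read(self._HEADER.size)
                offset, length = self._HEADER.unpack(header)
                data = self._log.read(length)
            self._remote.seek(offset)
            self._remote.write(data)

    def close(self):
        """Drain the log, then release both files."""
        self._pending.put(None)
        self._worker.join()
        self._remote.close()
        self._log.close()


def demo():
    """Issue two logged writes and return what reached the 'remote' file."""
    with tempfile.TemporaryDirectory() as tmp:
        remote_path = os.path.join(tmp, "remote.dat")
        writer = LogBasedWriter(os.path.join(tmp, "local.log"), remote_path)
        writer.write(0, b"hello ")
        writer.write(6, b"world")
        writer.close()
        with open(remote_path, "rb") as f:
            return f.read()
```

In this sketch the caller's `write` returns as soon as the record is appended to the local log, which is the property that would let a simulation overlap computation with the slower transfer to remote storage.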
Keywords
parallel machines; workstation clusters; data storage; high I/O bandwidth; log-based I/O mechanism; network resources; nonintrusive library; scientific simulations; server resources; supercomputers; Aggregates; Computational modeling; Computer simulation; File systems; Libraries; Neck; Rockets; Supercomputers; Throughput; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster Computing, 2005. IEEE International
Conference_Location
Burlington, MA
ISSN
1552-5244
Print_ISBN
0-7803-9486-0
Electronic_ISBN
1552-5244
Type
conf
DOI
10.1109/CLUSTR.2005.347041
Filename
4154084