DocumentCode
598574
Title
On distributed file tree walk of parallel file systems
Author
LaFon, J. ; Misra, Sudip ; Bringhurst, J.
Author_Institution
New Mexico State Univ., Las Cruces, NM, USA
fYear
2012
fDate
10-16 Nov. 2012
Firstpage
1
Lastpage
11
Abstract
Supercomputers generate vast amounts of data, typically organized into large directory hierarchies on parallel file systems. While the supercomputing applications are parallel, the tools used to process them, which require complete directory traversals, are typically serial. We present an algorithm framework and three fully distributed algorithms for traversing large parallel file systems and performing file operations in parallel. The first algorithm introduces a randomized work-stealing scheduler; the second improves the first with proximity-awareness; and the third improves upon the second by using a hybrid approach. We have tested our implementation on Cielo, a 1.37 petaflop supercomputer at the Los Alamos National Laboratory, and its 7 petabyte file system. Test results show that our algorithms execute orders of magnitude faster than state-of-the-art algorithms while achieving ideal load balancing and low communication cost. We present performance insights from the use of our algorithms in production systems at LANL, performing daily file system operations.
Keywords
mainframes; meta data; parallel algorithms; parallel databases; randomised algorithms; resource allocation; software performance evaluation; tree data structures; Cielo; Los Alamos National Laboratory; algorithm framework; complete directory traversals; computer speed 1.37 PFLOPS; distributed file tree; fully distributed algorithms; hybrid approach; ideal load balancing; large directory hierarchies; large parallel file systems; meta data; parallel applications; parallel file operations; petabyte file system; proximity-awareness; randomized work-stealing scheduler; supercomputers; supercomputing applications; Algorithm design and analysis; Heuristic algorithms; Parallel algorithms; Program processors; Servers; Synchronization; File Systems; Metadata; Parallel Algorithms;
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for
Conference_Location
Salt Lake City, UT
ISSN
2167-4329
Print_ISBN
978-1-4673-0805-2
Type
conf
DOI
10.1109/SC.2012.82
Filename
6468454