DocumentCode
3120834
Title
Introducing map-reduce to high end computing
Author
Mackey, Grant ; Sehrish, Saba ; Bent, John ; Lopez, Julio ; Habib, Salman ; Wang, Jun
Author_Institution
Los Alamos Nat. Lab., Univ. of Central Florida, Los Alamos, NM
fYear
2008
fDate
17-17 Nov. 2008
Firstpage
1
Lastpage
6
Abstract
In this work we present a scientific application that has been given a Hadoop MapReduce implementation. We also discuss other scientific fields of supercomputing that could benefit from a MapReduce implementation. We recognize that Hadoop has potential benefit for more applications than simply data mining, but that it is not a panacea for all data-intensive applications. We provide an example of how the halo-finding application, when applied to large astrophysics datasets, benefits from the model of the Hadoop architecture. The halo-finding application uses a friends-of-friends algorithm to quickly cluster large sets of particles and write output files that visualization software can interpret. The current implementation requires that large datasets be moved from storage to computation resources for every simulation of astronomy data. Our Hadoop implementation allows halo finding to run in place on the datasets, which removes the time-consuming process of transferring data between resources.
Keywords
astronomy computing; data visualisation; Hadoop MapReduce implementation; Hadoop architecture; astronomy data; astrophysics datasets; computation resources; data intensive applications; high end computing; supercomputing scientific fields; visualization software; Benchmark testing; Current measurement; Engines; Instruments; Laboratories; Libraries; System performance; Time measurement; Timing; Utility programs;
fLanguage
English
Publisher
IEEE
Conference_Titel
Petascale Data Storage Workshop, 2008. PDSW '08. 3rd
Conference_Location
Austin, TX
Print_ISBN
978-1-4244-4208-9
Type
conf
DOI
10.1109/PDSW.2008.4811889
Filename
4811889