DocumentCode :
3462333
Title :
Using MapReduce for High Energy Physics Data Analysis
Author :
Glaser, Fabian ; Neukirchen, Helmut ; Rings, Thomas ; Grabowski, Jens
Author_Institution :
Inst. of Comput. Sci., Univ. of Gottingen, Gottingen, Germany
fYear :
2013
fDate :
3-5 Dec. 2013
Firstpage :
1271
Lastpage :
1278
Abstract :
At the Large Hadron Collider (LHC) High Energy Physics (HEP) experiment at CERN, 15 PB of raw data is recorded per year. Because it is considered inconvenient to store, access, and process this data using traditional hardware and software tools, the data is reduced to 10-200 TB per year. This paper investigates the applicability of the MapReduce paradigm to analyzing HEP data. In a case study, a sample HEP analysis that uses the HEP analysis framework ROOT has been re-implemented using the MapReduce implementation Apache Hadoop. In addition, a Hadoop input format has been developed that takes the storage locality of the ROOT file format into account. This approach was evaluated in a cloud computing environment and compared to data analysis with the Parallel ROOT Facility (PROOF).
Keywords :
data analysis; high energy physics instrumentation computing; Apache Hadoop input format; CERN; HEP data analysis framework ROOT; HEP experiment; LHC; Large Hadron Collider; MapReduce; PROOF; Parallel ROOT Facility; ROOT file format; cloud computing environment; hardware tools; high energy physics data analysis; software tools; storage locality; Cloud computing; Computer architecture; Distributed databases; Measurement; Physics; Hadoop; High Energy Physics; Input format; ROOT
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Science and Engineering (CSE), 2013 IEEE 16th International Conference on
Conference_Location :
Sydney, NSW
Type :
conf
DOI :
10.1109/CSE.2013.189
Filename :
6755371