DocumentCode :
792723
Title :
Performance of the Distributed Central Analysis in BaBar
Author :
Khan, Ajmal ; Mommsen, Remigius K. ; Gradl, W. ; Fritsch, Marco ; Petzold, A. ; Roethel, W. ; Smith, David A.
Author_Institution :
Sch. of Eng. & Design, Brunel Univ.
Volume :
53
Issue :
5
fYear :
2006
Firstpage :
2876
Lastpage :
2880
Abstract :
The total dataset produced by the BaBar experiment at the Stanford Linear Accelerator Center (SLAC) currently comprises roughly 3 × 10^9 data events and an equal number of simulated events, corresponding to 23 Tbytes of real data and 51 Tbytes of simulated events. Since individual analyses typically select a very small fraction of all events, it would be extremely inefficient if each analysis had to process the full dataset. A first, centrally managed analysis step is therefore a common pre-selection ('skimming') of all data according to very loose, inclusive criteria to facilitate data access for later analysis. Usually, several analyses share common selection criteria. However, these criteria may change over time, e.g., when new analyses are developed. Currently, O(100) such pre-selection streams ('skims') are defined. In order to provide timely access to newly created or modified skims, it is necessary to process the complete dataset several times a year. Additionally, newly taken or simulated data has to be skimmed as it becomes available. The system currently deployed for skim production uses 1800 CPUs distributed over three production sites. It was possible to process the complete dataset within about 3.5 months. We report on the stability and the performance of the system.
Keywords :
data handling; database management systems; high energy physics instrumentation computing; particle detectors; physics computing; BaBar; CPU; SLAC; Stanford Linear Accelerator Center; data events; data management; data processing; distributed central analysis; distributed computing; preselection streams; simulated events; skim production; Analytical models; Detectors; Discrete event simulation; Linear accelerators; Performance analysis; Physics; Production systems; Stability
fLanguage :
English
Journal_Title :
Nuclear Science, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9499
Type :
jour
DOI :
10.1109/TNS.2006.881737
Filename :
1710288