Author :
Antos, J. ; Babik, M. ; Benjamin, D. ; Cabrera, S. ; Chan, A.W. ; Chen, Y.C. ; Coca, M. ; Cooper, B. ; Farrington, S. ; Genser, K. ; Hatakeyama, K. ; Hou, S. ; Hsieh, T.L. ; Jayatilaka, B. ; Jun, S.Y. ; Kotwal, A.V. ; Kraan, A.C. ; Lysak, R. ; Mandrichenk
Abstract :
The data processing model for the CDF experiment is described. Data processing reconstructs events from parallel data streams taken with different combinations of physics event triggers and further splits the events into datasets for specialised physics interests. The design of the processing control system imposes strict requirements on bookkeeping records, which trace the status of data files and event contents during processing and storage. The computing architecture was updated to handle the mass data flow of the Run II data collection, which was recently upgraded to a maximum rate of 40 MByte/sec. The data processing facility consists of a large cluster of Linux computers, with data movement to a multi-petaByte Enstore tape library managed by the CDF data handling system. The latest processing cycle has achieved a stable speed of 35 MByte/sec (3 TByte/day). It can be readily scaled by increasing CPU and data-handling capacity as required.
Keywords :
control system CAD; data acquisition; data handling; grid computing; particle calorimetry; physics computing; position sensitive particle detectors; CDF; CPU; GRID; Linux computers; Run II data collection; acquisition system; bookkeeping records; calorimetry; computer system; computing architecture; data files; data movement; data processing control system; data-handling capacity; datasets; mass data flow; multipetaByte Enstore tape library; parallel data streams; particle tracking; physics event triggers; Computer architecture; Control systems; Data flow computing; Data handling; Data processing; Libraries; Linux; Physics; Process control; Process design