DocumentCode :
1872387
Title :
Extracting the textual and temporal structure of supercomputing logs
Author :
Jain, Sourabh ; Singh, Inderpreet ; Chandra, Abhishek ; Zhang, Zhi-Li ; Bronevetsky, Greg
Author_Institution :
Dept. of Comput. Sci., Univ. of Minnesota-Twin Cities, Minneapolis, MN, USA
fYear :
2009
fDate :
16-19 Dec. 2009
Firstpage :
254
Lastpage :
263
Abstract :
Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable source of information about their operational status and health. However, their massive size, complexity, and lack of standard format make it difficult to automatically extract information that can be used to improve system management. In this work, we propose a novel method to succinctly represent the contents of supercomputing logs by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an online clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique achieves nearly perfect accuracy for online log classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.
Keywords :
information filtering; parallel machines; pattern classification; pattern clustering; text analysis; information extraction; log analysis techniques; log message syntactic structures; online clustering algorithm; supercomputers; supercomputing log temporal structure; system logs; system management; temporal message patterns; temporal proximity; textual clustering; textual extraction; Cities and towns; Clustering algorithms; Computer science; Data mining; Fault diagnosis; Laboratories; Large-scale systems; Pattern analysis; Storms; Supercomputers;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computing (HiPC), 2009 International Conference on
Conference_Location :
Kochi
Print_ISBN :
978-1-4244-4922-4
Electronic_ISBN :
978-1-4244-4921-7
Type :
conf
DOI :
10.1109/HIPC.2009.5433202
Filename :
5433202