DocumentCode :
589115
Title :
Thorough Analysis of Log Data with Dependency Rules: Practical Solutions and Theoretical Challenges
Author :
Hamalainen, W.
Author_Institution :
Dept. of Comput., Univ. of Eastern Finland, Joensuu, Finland
fYear :
2012
fDate :
10-10 Dec. 2012
Firstpage :
579
Lastpage :
586
Abstract :
In this paper, we present our vision of how statistical dependency rule mining could be applied to a thorough analysis of log data. Dependency rules are especially attractive as a first-step mining method due to their efficient algorithms and globally optimal results. The major drawback is the rather specific form of the dependencies, which requires binary data. It is not always clear how heterogeneous real-world data should be binarized and how the tools should be used so that all interesting dependencies are caught. We give an overview of typical problems in analyzing log data. The three major problems are: 1) How to balance between groups and individuals so that both general regularities and individual peculiarities can be found? 2) How to handle numerical and periodic variables? 3) How to extract features from the intrinsic dimensions of log data? For each problem, we give practical solutions in the form of preprocessing techniques and constraints that can be used with the existing tools. We also point out important research problems and algorithmic challenges that require further research.
Keywords :
constraint handling; data analysis; data mining; feature extraction; numerical analysis; statistical analysis; binary data; feature extraction; first step mining method; heterogeneous real world data; intrinsic log data dimensions; numerical variables; periodic variables; preprocessing techniques; statistical dependency rule mining; thorough log data analysis; Algorithm design and analysis; Automata; Cows; Data mining; Feature extraction; Feeds; Redundancy; Dependency rule; discretization; hierarchical variable; intrinsic dimensionality; log data; numerical variable; preprocessing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on
Conference_Location :
Brussels
Print_ISBN :
978-1-4673-5164-5
Type :
conf
DOI :
10.1109/ICDMW.2012.97
Filename :
6406404