DocumentCode :
748210
Title :
Mining mutually dependent patterns for system management
Author :
Ma, Sheng ; Hellerstein, Joseph L.
Author_Institution :
IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA
Volume :
20
Issue :
4
fYear :
2002
fDate :
5/1/2002 12:00:00 AM
Firstpage :
726
Lastpage :
735
Abstract :
In some domains, such as isolating problems in computer networks and discovering stock market irregularities, there is more interest in patterns consisting of infrequent but highly correlated items than in patterns that occur frequently (as defined by minsup, the minimum support level). We describe the m-pattern, a new type of pattern defined in terms of minp, the minimum probability of mutual dependence of the items in the pattern. We show that all infrequent m-patterns can be discovered by an efficient algorithm that makes use of: (1) a linear algorithm to qualify an m-pattern; (2) an effective technique for candidate pruning based on a necessary condition for the presence of an m-pattern; and (3) a level-wise search for m-pattern discovery (which is possible because m-patterns are downward closed). Further, we consider frequent m-patterns, which are defined in terms of both minp and minsup. Using synthetic data, we study the scalability of our algorithm. Then, we apply our algorithm to data from a production computer network, both to show the m-patterns present and to contrast them with frequent patterns. We show that when minp=0, our algorithm is equivalent to finding frequent patterns. With a larger minp, however, our algorithm yields a modest number of highly correlated items, which makes it possible to mine for infrequent but highly correlated itemsets. To date, many actionable m-patterns have been discovered in production systems.
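The level-wise search described in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the qualification test, so the sketch assumes one: an itemset E qualifies when support(E) / support({a}) >= minp for every item a in E, which takes one pass over the items of E. The function names (`support`, `is_m_pattern`, `mine_m_patterns`) and the toy transaction format are hypothetical, the sketch omits minsup and the paper's candidate-pruning technique, and it should not be read as the authors' implementation.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def is_m_pattern(itemset, transactions, minp):
    """Assumed qualification test (not stated in the abstract):
    support(E) / support({a}) >= minp for every item a in E."""
    joint = support(itemset, transactions)
    if joint == 0:
        return False
    # joint > 0 implies every single-item support is > 0, so no division by zero.
    return all(
        joint / support(frozenset([a]), transactions) >= minp for a in itemset
    )


def mine_m_patterns(transactions, minp):
    """Level-wise search: size-(k+1) candidates are built only from
    qualified size-k itemsets, relying on downward closure."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    level = [frozenset([a]) for a in items]
    found = []
    k = 1
    while level:
        qualified = [c for c in level if is_m_pattern(c, transactions, minp)]
        found.extend(qualified)
        qualified_set = set(qualified)
        candidates = set()
        for a, b in combinations(qualified, 2):
            union = a | b
            # Keep a candidate only if all of its size-k subsets qualified.
            if len(union) == k + 1 and all(
                frozenset(s) in qualified_set for s in combinations(union, k)
            ):
                candidates.add(union)
        level = sorted(candidates, key=sorted)
        k += 1
    return found


if __name__ == "__main__":
    # Toy data: "a" and "b" always co-occur; "d" is rare and appears once with "c".
    data = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}, {"c"}, {"c", "d"}]
    for pattern in mine_m_patterns(data, minp=0.8):
        print(sorted(pattern))
```

In the toy data, {a, b} has low support relative to the whole dataset but its items always appear together, so under the assumed test it qualifies at a high minp, while {c, d} does not because "c" mostly occurs without "d".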
Keywords :
computer network management; data mining; performance evaluation; probability; production engineering computing; algorithm scalability; correlated items; efficient algorithm; frequent m-patterns; frequent patterns; isolating problems; level-wise search; linear algorithm; m-pattern discovery; minimum probability; minimum support level; minp; minsup; mutually dependent patterns mining; necessary condition; production computer network; stock market irregularities; synthetic data; system management; Association rules; Computer network management; Computer networks; Data mining; Filtering; Itemsets; Production systems; Scalability; Stock markets; Vents;
fLanguage :
English
Journal_Title :
Selected Areas in Communications, IEEE Journal on
Publisher :
IEEE
ISSN :
0733-8716
Type :
jour
DOI :
10.1109/JSAC.2002.1003039
Filename :
1003039