Title :
User-directed exploration of mining space with multiple attributes
Author :
Perng, Chang-Shing ; Wang, Haixun ; Ma, Sheng ; Hellerstein, Joseph L.
Author_Institution :
IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA
Abstract :
There has been a growing interest in mining frequent itemsets in relational data with multiple attributes. A key step in this approach is to select a set of attributes that group data into transactions and a separate set of attributes that label data into items. Unsupervised and unrestricted mining, however, is stymied by the combinatorial complexity and the quantity of patterns as the number of attributes grows. In this paper we focus on leveraging the semantics of the underlying data for mining frequent itemsets. For instance, there are usually taxonomies in the data schema and functional dependencies among the attributes. Domain knowledge and user preferences often have the potential to significantly reduce the exponentially growing mining space. These observations motivate the design of a user-directed data mining framework that allows such domain knowledge to guide the mining process and control the mining strategy. We show examples of tremendous reduction in computation by using domain knowledge in mining relational data with multiple attributes.
Keywords :
data mining; relational databases; transaction processing; combinatorial complexity; data grouping; data labelling; data schema; domain knowledge; exponentially growing mining space reduction; frequent itemset mining; functional dependencies; multiple attributes; relational data; semantics; taxonomies; transactions; unsupervised unrestricted mining; user directed data mining framework; user preferences; user-directed mining space exploration; Complex networks; Computational complexity; Data mining; Itemsets; Marine vehicles; Pattern analysis; Production; Space exploration; Taxonomy;
Conference_Title :
Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN :
0-7695-1754-4
DOI :
10.1109/ICDM.2002.1183931