DocumentCode :
3532543
Title :
Support driven opportunistic aggregation for generalized itemset extraction
Author :
Baralis, Elena ; Cagliero, Luca ; Cerquitelli, Tania ; D'Elia, Vincenzo ; Garza, Paolo
Author_Institution :
Dipt. di Autom. e Inf., Politec. di Torino, Turin, Italy
fYear :
2010
fDate :
7-9 July 2010
Firstpage :
102
Lastpage :
107
Abstract :
Association rule extraction is a widely used exploratory technique that has been exploited in different contexts (e.g., biological data, medical images). However, association rule extraction, driven by support and confidence constraints, entails (i) generating a huge number of rules that are difficult to analyze, or (ii) pruning rare itemsets, even if their hidden knowledge might be relevant. To address these issues, this paper presents a novel frequent itemset mining algorithm, called GENIO (GENeralized Itemset DiscOverer), to analyze correlation among data by means of generalized itemsets, which provide a powerful tool to efficiently extract hidden knowledge discarded by previous approaches. The proposed technique exploits a user-provided taxonomy to drive the pruning phase of the extraction process. Instead of extracting itemsets for all levels of the taxonomy and post-pruning them, the GENIO algorithm performs a support driven opportunistic aggregation of itemsets. Generalized itemsets are extracted only if itemsets at a lower level in the taxonomy are below the support threshold. Experiments performed in the network traffic domain show the efficiency and the effectiveness of the proposed algorithm.
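The aggregation rule stated in the abstract (generalize an itemset only when it falls below the support threshold) can be illustrated with a minimal Python sketch. This is not the GENIO algorithm from the paper: the function names, the `parent` dictionary encoding of the taxonomy, the fixed itemset size, and the choice to lift every item of an infrequent itemset one taxonomy level at a time are all assumptions made for illustration, and the sample network traffic data is invented.

```python
from collections import Counter
from itertools import combinations


def generalize(items, parent):
    """Replace each item by its taxonomy parent; items without a parent are kept."""
    return frozenset(parent.get(item, item) for item in items)


def mine_opportunistic(transactions, parent, min_support, size=2):
    """Return {itemset: support} for itemsets of the given size.

    Assumption: the taxonomy is a forest given as a child -> parent mapping.
    An itemset is lifted to the next taxonomy level only if its support
    is below min_support; frequent itemsets are never generalized.
    """
    current = [frozenset(t) for t in transactions]
    frequent = {}
    candidates = None  # None means "count every itemset" (lowest level)
    while True:
        counts = Counter()
        for t in current:
            for combo in combinations(sorted(t), size):
                itemset = frozenset(combo)
                if candidates is None or itemset in candidates:
                    counts[itemset] += 1
        infrequent = []
        for itemset, support in counts.items():
            if support >= min_support:
                frequent[itemset] = support
            else:
                infrequent.append(itemset)
        # Opportunistic step: only infrequent itemsets produce generalized candidates.
        candidates = set()
        for itemset in infrequent:
            lifted = generalize(itemset, parent)
            if lifted != itemset and len(lifted) == size and lifted not in frequent:
                candidates.add(lifted)
        if not candidates:
            return frequent
        current = [generalize(t, parent) for t in current]


# Invented example data: each transaction is one observed network flow.
parent = {
    "ip=10.0.0.1": "net=10.0.0.0/24",
    "ip=10.0.0.2": "net=10.0.0.0/24",
    "ip=10.0.0.3": "net=10.0.0.0/24",
    "port=80": "port=web",
    "port=443": "port=web",
    "port=22": "port=admin",
}
transactions = [
    {"ip=10.0.0.1", "port=80"},
    {"ip=10.0.0.2", "port=443"},
    {"ip=10.0.0.1", "port=80"},
    {"ip=10.0.0.3", "port=22"},
]
result = mine_opportunistic(transactions, parent, min_support=2)
for itemset, support in result.items():
    print(sorted(itemset), support)
```

In this toy run the pair (ip=10.0.0.1, port=80) is frequent at the lowest level and is kept as is, while the two rare pairs are lifted; their generalization (net=10.0.0.0/24, port=web) reaches the threshold and is reported instead of being pruned.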
Keywords :
data analysis; data mining; GENIO algorithm; association rule extraction; confidence constraints; generalized itemset discoverer algorithm; generalized itemset extraction; itemset mining algorithm; support constraints; support driven opportunistic aggregation; Algorithm design and analysis; Association rules; Biomedical imaging; Data analysis; Data mining; Frequency; Itemsets; Probability; Taxonomy; Telecommunication traffic; Generalized itemset mining; data mining techniques; knowledge discovery; network traffic data analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Systems (IS), 2010 5th IEEE International Conference
Conference_Location :
London
Print_ISBN :
978-1-4244-5163-0
Electronic_ISBN :
978-1-4244-5164-7
Type :
conf
DOI :
10.1109/IS.2010.5548348
Filename :
5548348