DocumentCode :
2779726
Title :
Novel algorithm for mining high utility itemsets
Author :
Shankar, Subramaniam ; Purusothaman, T. ; Jayanthi, Srinivas
Author_Institution :
SKCET, Coimbatore
fYear :
2008
fDate :
18-20 Dec. 2008
Firstpage :
1
Lastpage :
6
Abstract :
One of the important issues in data mining is the interestingness problem. Typically, in a data mining process, the number of patterns discovered can easily exceed the capabilities of a human user to identify interesting results. To address this problem, utility measures have been used to reduce the patterns prior to presenting them to the user. The fundamental idea behind mining frequent itemsets is that only itemsets with high frequency are of interest to users. However, the practical usefulness of frequent itemsets is limited by the significance of the discovered itemsets. A frequent itemset only reflects the statistical correlation between items, and it does not reflect the semantic significance of the items. In this paper, we use a utility-based itemset mining approach to overcome this limitation. Utility-based data mining is a new research area interested in all types of utility factors in data mining processes and targeted at incorporating utility considerations in data mining tasks. High utility itemset mining is a research area of utility-based data mining, aimed at finding itemsets that contribute high utility. This paper presents a novel algorithm, fast utility mining (FUM), which finds all high utility itemsets within the given utility constraint threshold. It is faster and simpler than the original Umining algorithm. The experimental evaluation on artificial datasets shows that our algorithm executes faster than the Umining algorithm when more itemsets are identified as high utility itemsets and when the number of distinct items in the database increases. The proposed FUM algorithm scales well as the size of the transaction database increases with regard to the number of distinct items available.
Keywords :
data mining; statistical analysis; fast utility mining algorithm; statistical correlation; transaction database; utility itemset; Association rules; Costs; Data mining; Frequency; Humans; Itemsets; Marketing and sales; Transaction databases; Upper bound;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computing, Communication and Networking, 2008. ICCCn 2008. International Conference on
Conference_Location :
St. Thomas, VI
Print_ISBN :
978-1-4244-3594-4
Electronic_ISBN :
978-1-4244-3595-1
Type :
conf
DOI :
10.1109/ICCCNET.2008.4787766
Filename :
4787766