DocumentCode :
1804419
Title :
Mining Frequent Itemsets from Noisy Data
Author :
Narita, Kazuyo ; Kitagawa, Hiroyuki
Author_Institution :
University of Tsukuba, Japan
fYear :
2006
fDate :
2006
Abstract :
As we face huge amounts of varied information, data mining, which helps us systematically discover hidden features or rules in voluminous data, has become more important [3, 4, 6, 10]. However, real-world data is often dirty and includes noise such as missing or irrelevant values, so the information mined from such noisy data may be incorrect. We model noisy data with probabilities, assuming that noise is mixed with data statistically. We also propose a way to find frequent itemsets [2] by estimating their supports on noiseless data from the noisy data. An algorithm using the FP-tree [6, 10] is also presented to mine frequent itemsets efficiently.
Keywords :
Conferences; Data engineering; Data mining; Data models; Data privacy; Itemsets; Probability; Proposals; Systems engineering and theory; Transaction databases;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering Workshops, 2006. Proceedings. 22nd International Conference on
Conference_Location :
Atlanta, GA, USA
Print_ISBN :
0-7695-2571-7
Type :
conf
DOI :
10.1109/ICDEW.2006.90
Filename :
1623912