DocumentCode :
2456616
Title :
Discovering Threshold-based Frequent Closed Itemsets over Probabilistic Data
Author :
Tong, Yongxin ; Chen, Lei ; Ding, Bolin
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Eng., Hong Kong, China
fYear :
2012
fDate :
1-5 April 2012
Firstpage :
270
Lastpage :
281
Abstract :
In recent years, many new applications, such as sensor network monitoring and moving object search, show a growing amount of importance of uncertain data management and mining. In this paper, we study the problem of discovering threshold-based frequent closed item sets over probabilistic data. Frequent item set mining over probabilistic database has attracted much attention recently. However, existing solutions may lead an exponential number of results due to the downward closure property over probabilistic data. Moreover, it is hard to directly extend the successful experiences from mining exact data to a probabilistic environment due to the inherent uncertainty of data. Thus, in order to obtain a reasonable result set with small size, we study discovering frequent closed item sets over probabilistic data. We prove that even a sub-problem of this problem, computing the frequent closed probability of an item set, is #P-Hard. Therefore, we develop an efficient mining algorithm based on depth-first search strategy to obtain all probabilistic frequent closed item sets. To reduce the search space and avoid redundant computation, we further design several probabilistic pruning and bounding techniques. Finally, we verify the effectiveness and efficiency of the proposed methods through extensive experiments.
Keywords :
data mining; probability; #P-hard; bounding technique; data management; data mining; depth-first search strategy; moving object search; probabilistic data; probabilistic frequent closed item sets; probabilistic pruning technique; sensor network monitoring; threshold-based frequent closed itemsets; Algorithm design and analysis; Data mining; Itemsets; Probabilistic logic; Search problems;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering (ICDE), 2012 IEEE 28th International Conference on
Conference_Location :
Washington, DC
ISSN :
1063-6382
Print_ISBN :
978-1-4673-0042-1
Type :
conf
DOI :
10.1109/ICDE.2012.51
Filename :
6228090
Link To Document :
بازگشت