Abstract :
In this paper, the problem of constraint-based pattern discovery is investigated. By admitting user-specified constraints beyond traditional rule measures such as minimum support and confidence, research on this topic endeavors to reflect the real interests of analysts and relieve them of the overabundance of rules. Surprisingly, very little research has addressed multiple types of constraints. In our previous work, we studied this problem, focusing on three types of constraints: item constraints, aggregation constraints, and cardinality constraints, and proposed an efficient Apriori-like algorithm called MCFP. In this paper, we propose a new algorithm called MCFPTree, which is based on the FP-tree structure and therefore does not suffer from the problem of candidate itemset generation. Experimental results show that MCFPTree is significantly faster than both MCFP and FP-Growth+, an intuitive baseline that post-processes the frequent patterns generated by FP-Growth against the user-specified constraints.
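To make the post-processing baseline concrete, the following is a minimal sketch of filtering already-mined frequent patterns against the three constraint types. The abstract does not define the constraints formally, so the forms used here are assumptions: the item constraint requires a given set of items to be present, the aggregation constraint bounds the sum of a per-item attribute (hypothetically named `price`), and the cardinality constraint bounds the pattern size. All function and parameter names are illustrative and are not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterable, List

Pattern = FrozenSet[str]


def satisfies_item(pattern: Pattern, required: FrozenSet[str]) -> bool:
    # Assumed item constraint: every required item appears in the pattern.
    return required <= pattern


def satisfies_aggregation(pattern: Pattern, price: Dict[str, float],
                          max_total: float) -> bool:
    # Assumed aggregation constraint: sum of an item attribute is bounded.
    return sum(price[item] for item in pattern) <= max_total


def satisfies_cardinality(pattern: Pattern, min_size: int,
                          max_size: int) -> bool:
    # Assumed cardinality constraint: pattern size lies in a closed range.
    return min_size <= len(pattern) <= max_size


def post_filter(patterns: Iterable[Pattern], required: FrozenSet[str],
                price: Dict[str, float], max_total: float,
                min_size: int, max_size: int) -> List[Pattern]:
    # Keep only the frequent patterns that satisfy all three constraints.
    return [
        p for p in patterns
        if satisfies_item(p, required)
        and satisfies_aggregation(p, price, max_total)
        and satisfies_cardinality(p, min_size, max_size)
    ]


# Hypothetical frequent patterns, as if produced by a frequent-pattern miner.
frequent = [
    frozenset({"a"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
    frozenset({"b", "c"}),
]
price = {"a": 1.0, "b": 2.0, "c": 5.0}
kept = post_filter(frequent, frozenset({"a"}), price,
                   max_total=5.0, min_size=2, max_size=3)
```

A filter of this kind examines every frequent pattern after mining has finished, which is the overhead that an approach checking constraints during mining aims to avoid.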
Keywords :
data mining; trees (mathematics); FP-tree-based algorithm; MCFPTree; aggregation constraint; apriori-like algorithm; cardinality constraint; item constraint; multiconstrained patterns discovery; Algorithm design and analysis; Association rules; Communication system software; Competitive intelligence; Computer science; Data mining; Itemsets; Software algorithms; Software systems; Terminology;