Title of article :
Cosine interesting pattern discovery
Author/Authors :
Junjie Wu, Shiwei Zhu, Hongfu Liu, Guoping Xia
Issue Information :
Journal with serial issue number, year 2012
Pages :
20
From page :
176
To page :
195
Abstract :
Recent years have witnessed increasing interest in computing cosine similarity between high-dimensional documents, transactions, gene sequences, and the like. Most previous studies limited their scope to pairs of items and cannot be adapted to the multi-itemset case. From a frequent pattern mining perspective, there is therefore still a critical need to discover interesting patterns whose cosine similarity values are above given thresholds. The main difficulty is that cosine similarity does not have the anti-monotone property. To meet this challenge, we propose the notions of the conditional anti-monotone property and the Support-Ascending Set Enumeration Tree (SA-SET). We prove that cosine similarity has the conditional anti-monotone property and can therefore be used for interesting pattern mining if the itemset traversal sequence is defined by the SA-SET. We also identify the anti-monotone property of an upper bound of the cosine similarity, which can be used to further prune candidate itemsets. An Apriori-like algorithm called CosMiner is then put forward to mine cosine interesting patterns from large-scale multi-item databases. Experimental results show that CosMiner can efficiently identify interesting patterns using the conditional anti-monotone property of the cosine similarity and the anti-monotone property of its upper bound, even at extremely low levels of support.
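The pruning idea in the abstract can be illustrated with a minimal sketch. The abstract gives no formula, so the sketch assumes a common multi-item generalization of cosine: the support of the itemset divided by the geometric mean of the supports of its items. The names `support`, `cosine`, and `mine` are hypothetical. The depth-first traversal is a simplification rather than the Apriori-like CosMiner itself, and the upper-bound pruning is omitted. Under the assumed definition, extending an itemset with an item whose support is at least as large as that of every item already in it cannot raise the cosine value, so a branch that falls below the threshold can be cut when items are visited in ascending order of support.

```python
from math import prod


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def cosine(itemset, transactions):
    """ASSUMED definition: supp(X) / geometric mean of single-item supports."""
    k = len(itemset)
    denom = prod(support([i], transactions) for i in itemset) ** (1.0 / k)
    return support(itemset, transactions) / denom if denom > 0 else 0.0


def mine(transactions, min_cos):
    """Depth-first walk over a support-ascending set enumeration tree.

    Hypothetical simplification: not the paper's CosMiner, and without
    the upper-bound pruning mentioned in the abstract.
    """
    # Order items by ascending support (ties broken by item name).
    items = sorted(
        {i for t in transactions for i in t},
        key=lambda i: (support([i], transactions), i),
    )
    results = []

    def extend(prefix, start):
        for idx in range(start, len(items)):
            cand = prefix + [items[idx]]
            c = cosine(cand, transactions)
            if c < min_cos:
                # Every extension in this subtree adds items with support
                # no smaller than those in `cand`, so cosine cannot rise.
                continue
            if len(cand) >= 2:
                results.append((tuple(cand), c))
            extend(cand, idx + 1)

    extend([], 0)
    return results
```

Calling `mine(transactions, 0.5)` on a list of Python sets returns the itemsets of size two or more whose assumed cosine value reaches 0.5, each listed in support-ascending item order.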
Keywords :
Set enumeration tree , Cosine similarity , Interestingness measure , Conditional anti-monotone property , Correlation computation
Journal title :
Information Sciences
Serial Year :
2012
Record number :
1214859