DocumentCode :
2773354
Title :
A Cost-Effective LSH Filter for Fast Pairwise Mining
Author :
Zhao, Gang ; Xiong, Yun ; Cao, Longbing ; Luo, Dan ; Su, Xuchun ; Zhu, Yangyong
Author_Institution :
Sch. of Comput. Sci., Fudan Univ., Shanghai, China
fYear :
2009
fDate :
6-9 Dec. 2009
Firstpage :
1088
Lastpage :
1093
Abstract :
The pairwise mining problem is to discover pairwise objects having measures greater than the user-specified minimum threshold from a collection of objects. It is essential in a large variety of database and data-mining applications. Of late, there has been increasing interest in applying a Locality-Sensitive Hashing (LSH) scheme for pairwise mining. LSH-type methods have shown themselves to be simply implementable and capable of achieving significant performance gain in running time over most exact methods. However, the present LSH-type methods still suffer from some bottlenecks, such as ¿the curse of threshold¿. In this paper, we proposed a novel LSH-based method, namely Cost-effective LSH filter (Ce-LSH for short), for pairwise mining. Compared with previous LSH-type methods, it uses a lower fixed number of LSH functions and is thus more cost-effective. Substantial experiments evidence that our method gives significant improvement in running time over existing LSH-type methods and some recently reported method based on upper-bound. Experimental results also indicate that it scales well even for a relatively low minimum threshold and for a fairly small miss ratio.
Keywords :
data mining; database management systems; cost-effective LSH filter; curse of threshold; data mining; database; locality-sensitive hashing; pairwise mining; Australia; Computer science; Data engineering; Data mining; Data models; Databases; Information filtering; Information filters; Information technology; Performance gain; locality hashing function; pairwise mining;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
ISSN :
1550-4786
Print_ISBN :
978-1-4244-5242-2
Electronic_ISBN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2009.112
Filename :
5360361
Link To Document :
بازگشت