DocumentCode :
610353
Title :
Efficient search algorithm for SimRank
Author :
Fujiwara, Yuichiro ; Nakatsuji, M. ; Shiokawa, H. ; Onizuka, M.
Author_Institution :
NTT Software Innovation Center, Tokyo, Japan
fYear :
2013
fDate :
8-12 April 2013
Firstpage :
589
Lastpage :
600
Abstract :
Graphs are a fundamental data structure and have been employed to model objects as well as their relationships. The similarity of objects on the web (e.g., webpages, photos, music, micro-blogs, and social networking service users) is the key to identifying relevant objects in many recent applications. SimRank, proposed by Jeh and Widom, provides a good similarity score and has been successfully used in many applications such as web spam detection, collaborative tagging analysis, link prediction, and so on. SimRank computes similarities iteratively, and it needs O(N4T) time and O(N2) space for similarity computation where N and T are the number of nodes and iterations, respectively. Unfortunately, this iterative approach is computationally expensive. The goal of this work is to process top-k search and range search efficiently for a given node. Our solution, SimMat, is based on two ideas: (1) It computes the approximate similarity of a selected node pair efficiently in non-iterative style based on the Sylvester equation, and (2) It prunes unnecessary approximate similarity computations when searching for the high similarity nodes by exploiting estimations based on the Cauchy-Schwarz inequality. These two ideas reduce the time and space complexities of the proposed approach to O(Nn) where n is the target rank of the low-rank approximation (n ≪ N in practice). Our experiments show that our approach is much faster, by several orders of magnitude, than previous approaches in finding the high similarity nodes.
Keywords :
approximation theory; computational complexity; graph theory; search problems; Cauchy-Schwarz inequality; SimMat; SimRank; Sylvester equation; Web spam detection; Webpage; collaborative tagging analysis; data structure; graph; link prediction; low-rank approximation; microblog; music; noniterative style; object similarity; photo; range search; search algorithm; similarity computation; similarity node; similarity score; social networking service user; space complexity; time complexity; top-k search; Approximation methods; Eigenvalues and eigenfunctions; Equations; Iterative methods; Mathematical model; Social network services; Vectors;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering (ICDE), 2013 IEEE 29th International Conference on
Conference_Location :
Brisbane, QLD
ISSN :
1063-6382
Print_ISBN :
978-1-4673-4909-3
Electronic_ISBN :
1063-6382
Type :
conf
DOI :
10.1109/ICDE.2013.6544858
Filename :
6544858
Link To Document :
بازگشت