مرکز منطقه ای اطلاع رساني علوم و فناوري

DocumentCode :

610353

Title :

Efficient search algorithm for SimRank

Author :

Fujiwara, Yuichiro ; Nakatsuji, M. ; Shiokawa, H. ; Onizuka, M.

Author_Institution :

NTT Software Innovation Center, Tokyo, Japan

fYear :

2013

fDate :

8-12 April 2013

Firstpage :

589

Lastpage :

600

Abstract :

Graphs are a fundamental data structure and have been employed to model objects as well as their relationships. The similarity of objects on the web (e.g., webpages, photos, music, micro-blogs, and social networking service users) is the key to identifying relevant objects in many recent applications. SimRank, proposed by Jeh and Widom, provides a good similarity score and has been successfully used in many applications such as web spam detection, collaborative tagging analysis, link prediction, and so on. SimRank computes similarities iteratively, and it needs O(N⁴T) time and O(N²) space for similarity computation where N and T are the number of nodes and iterations, respectively. Unfortunately, this iterative approach is computationally expensive. The goal of this work is to process top-k search and range search efficiently for a given node. Our solution, SimMat, is based on two ideas: (1) It computes the approximate similarity of a selected node pair efficiently in non-iterative style based on the Sylvester equation, and (2) It prunes unnecessary approximate similarity computations when searching for the high similarity nodes by exploiting estimations based on the Cauchy-Schwarz inequality. These two ideas reduce the time and space complexities of the proposed approach to O(Nn) where n is the target rank of the low-rank approximation (n ≪ N in practice). Our experiments show that our approach is much faster, by several orders of magnitude, than previous approaches in finding the high similarity nodes.

Keywords :

approximation theory; computational complexity; graph theory; search problems; Cauchy-Schwarz inequality; SimMat; SimRank; Sylvester equation; Web spam detection; Webpage; collaborative tagging analysis; data structure; graph; link prediction; low-rank approximation; microblog; music; noniterative style; object similarity; photo; range search; search algorithm; similarity computation; similarity node; similarity score; social networking service user; space complexity; time complexity; top-k search; Approximation methods; Eigenvalues and eigenfunctions; Equations; Iterative methods; Mathematical model; Social network services; Vectors;

fLanguage :

English

Publisher :

ieee

Conference_Titel :

Data Engineering (ICDE), 2013 IEEE 29th International Conference on

Conference_Location :

Brisbane, QLD

ISSN :

1063-6382

Print_ISBN :

978-1-4673-4909-3

Electronic_ISBN :

1063-6382

Type :

conf

DOI :

10.1109/ICDE.2013.6544858

Filename :

6544858

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=610353