• DocumentCode
    48821
  • Title
    HeteSim: A General Framework for Relevance Measure in Heterogeneous Networks
  • Author
    Chuan Shi ; Xiangnan Kong ; Yue Huang ; Philip S. Yu ; Bin Wu
  • Author_Institution
    Beijing Univ. of Posts & Telecommun., Beijing, China
  • Volume
    26
  • Issue
    10
  • fYear
    2014
  • fDate
    Oct. 2014
  • Firstpage
    2479
  • Lastpage
    2492
  • Abstract
    Similarity search is an important function in many applications, and it usually focuses on measuring the similarity between objects of the same type. In many scenarios, however, we need to measure the relatedness between objects of different types. With the surge of research on heterogeneous networks, relevance measures for objects of different types are becoming increasingly important. In this paper, we study the relevance search problem in heterogeneous networks, where the task is to measure the relatedness of heterogeneous objects (objects of the same type or of different types). We propose a novel measure, HeteSim, which has the following attributes: (1) a uniform measure: it can measure the relatedness of objects of the same or different types in a uniform framework; (2) a path-constrained measure: the relatedness of an object pair is defined based on the search path that connects the two objects by following a sequence of node types; (3) a semi-metric measure: HeteSim has several desirable properties (e.g., self-maximum and symmetry), which are crucial to many data mining tasks. Moreover, we analyze the computational characteristics of HeteSim and propose corresponding fast computation strategies. Empirical studies show that HeteSim can effectively and efficiently evaluate the relatedness of heterogeneous objects.
  • Keywords
    search problems; HeteSim; data mining; general framework; heterogeneous networks; relevance search problem; similarity search; uniform framework; Collaboration; Educational institutions; Electronic mail; Joining processes; Semantics; Heterogeneous information network; pair-wise random walk; relevance measure
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Knowledge and Data Engineering
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2013.2297920
  • Filename
    6702458