• DocumentCode
    3129628
  • Title

    Automatically Spotting Information-Rich Nodes in Graphs

  • Author

    He, Xiao ; Feng, Jing ; Plant, Claudia

  • Author_Institution
    Univ. of Munich, Munich, Germany
  • fYear
    2011
  • fDate
    11-11 Dec. 2011
  • Firstpage
    941
  • Lastpage
    948
  • Abstract
    How to automatically spot the most interesting nodes in a large graph? Although graph mining has attracted lots of attention this question remains challenging for the following reasons: (1) It is not trivial to define what an interesting node is. In some cases, application-specific background knowledge may help but is not always available. Moreover, the notion of interestingness depends on the structure of the graph. In a sparse graph, a hub or a node at the center of a star-like or clique-like sub graph might be interesting. However, if the graph is densely connected, a remote node might be more interesting. (2) Most existing approaches to detect abnormal nodes involve parameters which are difficult to estimate. Our approach addresses both challenges by considering the problem from an information-theoretic perspective. An interesting node exhibits an outstanding link pattern which contributes much nonredundant information to the overall information content of the graph. Based on this basic idea, we define an intuitive similarity measure between nodes. Combined with metric embedding, this similarity measure allows spotting the most interesting nodes at first glance. Moreover, we propose Info-spot, a parameter free and efficient algorithm for fully automatic detection of interesting nodes. Guided by the objective of data compression, Info-spot greedily merges nodes exhibiting similar link patterns, while exceptional information-rich nodes naturally emerge. Extensive experiments demonstrate the benefits of our approach.
  • Keywords
    data mining; graph theory; abnormal node detection; automatic information-rich node spotting; automatic interesting node detection; clique-like subgraph; graph mining; info-spot algorithm; information-theoretic perspective; sparse graph; star-like subgraph; Complexity theory; Data mining; Encoding; Entropy; Merging; Runtime; Vectors; Automatically Detection; Information-theoretic; Outstanding Nodes; Rich Link Pattern;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on
  • Conference_Location
    Vancouver, BC
  • Print_ISBN
    978-1-4673-0005-6
  • Type

    conf

  • DOI
    10.1109/ICDMW.2011.37
  • Filename
    6137482