• DocumentCode
    3125885
  • Title

    Positive and Unlabeled Learning for Graph Classification

  • Author

    Zhao, Yuchen ; Kong, Xiangnan ; Yu, Philip S.

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Chicago, Chicago, IL, USA
  • fYear
    2011
  • fDate
    11-14 Dec. 2011
  • Firstpage
    962
  • Lastpage
    971
  • Abstract
    The problem of graph classification has drawn much attention in the last decade. Conventional approaches on graph classification focus on mining discriminative sub graph features under supervised settings. The feature selection strategies strictly follow the assumption that both positive and negative graphs exist. However, in many real-world applications, the negative graph examples are not available. In this paper we study the problem of how to select useful sub graph features and perform graph classification based upon only positive and unlabeled graphs. This problem is challenging and different from previous works on PU learning, because there are no predefined features in graph data. Moreover, the sub graph enumeration problem is NP-hard. We need to identify a subset of unlabeled graphs that are most likely to be negative graphs. However, the negative graph selection problem and the sub graph feature selection problem are correlated. Before the reliable negative graphs can be resolved, we need to have a set of useful sub graph features. In order to address this problem, we first derive an evaluation criterion to estimate the dependency between sub graph features and class labels based on a set of estimated negative graphs. In order to build accurate models for the PU learning problem on graph data, we propose an integrated approach to concurrently select the discriminative features and the negative graphs in an iterative manner. Experimental results illustrate the effectiveness and efficiency of the proposed method.
  • Keywords
    computational complexity; data mining; graph theory; learning (artificial intelligence); pattern classification; NP-hard problem; PU learning; discriminative subgraph features mining; graph classification; negative graph selection problem; positive learning; subgraph enumeration problem; unlabeled graphs; unlabeled learning; Data mining; Data models; Kernel; Optimization; Reliability; Support vector machines; Vectors; feature selection; graph classification; positive and unlabeled data;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Mining (ICDM), 2011 IEEE 11th International Conference on
  • Conference_Location
    Vancouver,BC
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4577-2075-8
  • Type

    conf

  • DOI
    10.1109/ICDM.2011.119
  • Filename
    6137301