• DocumentCode
    945789
  • Title

    Maximal Biclique Subgraphs and Closed Pattern Pairs of the Adjacency Matrix: A One-to-One Correspondence and Mining Algorithms

  • Author

    Li, Jinyan ; Liu, Guimei ; Li, Haiquan ; Wong, Limsoon

  • Author_Institution
    Nanyang Technol. Univ., Singapore
  • Volume
    19
  • Issue
    12
  • fYear
    2007
  • Firstpage
    1625
  • Lastpage
    1637
  • Abstract
    Maximal biclique (also known as complete bipartite) subgraphs can model many applications in Web mining, business, and bioinformatics. Enumerating maximal biclique subgraphs from a graph is a computationally challenging problem, as the size of the output can become exponentially large with respect to the vertex number when the graph grows. In this paper, we efficiently enumerate them through the use of closed patterns of the adjacency matrix of the graph. For an undirected graph G without self-loops, we prove that 1) the number of closed patterns in the adjacency matrix of G is even, 2) the number of the closed patterns is precisely double the number of maximal biclique subgraphs of G, and 3) for every maximal biclique subgraph, there always exists a unique pair of closed patterns that matches the two vertex sets of the subgraph. Therefore, the problem of enumerating maximal bicliques can be solved by using efficient algorithms for mining closed patterns, which are algorithms extensively studied in the data mining field. However, this direct use of existing algorithms causes a duplicated enumeration. To achieve high efficiency, we propose an O(mn) time delay algorithm for a nonduplicated enumeration, in particular, for enumerating those maximal bicliques with a large size, where m and n. are the number of edges and vertices of the graph, respectively. We evaluate the high efficiency of our algorithm by comparing it to state- of-the-art algorithms on three categories of graphs: randomly generated graphs, benchmarks, and a real-life protein interaction network. In this paper, we also prove that if self-loops are allowed in a graph, then the number of closed patterns in the adjacency matrix is not necessarily even, but the maximal bicliques are exactly the same as those of the graph after removing all the self-loops.
  • Keywords
    data mining; graph theory; matrix algebra; adjacency matrix; closed pattern pair; data mining; maximal biclique subgraph; mining algorithm; one-to-one correspondence; time delay algorithm; Graph algorithms; Mining methods and algorithms;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2007.190660
  • Filename
    4358946