• DocumentCode
    1159833
  • Title
    Association rule mining in peer-to-peer systems
  • Author
    Wolff, Ran; Schuster, Assaf
  • Author_Institution
    Comput. Sci. Dept., Technion-Israel Inst. of Technol., Haifa, Israel
  • Volume
    34
  • Issue
    6
  • fYear
    2004
  • Firstpage
    2426
  • Lastpage
    2438
  • Abstract
    We extend the problem of association rule mining, a key data mining problem, to systems in which the database is partitioned among a very large number of computers that are dispersed over a wide area. Such computing systems include grid computing platforms, federated database systems, and peer-to-peer computing environments. The scale of these systems poses several difficulties, such as the impracticality of global communications and global synchronization, dynamic topology changes of the network, on-the-fly data updates, the need to share resources with other applications, and the frequent failure and recovery of resources. We present an algorithm by which every node in the system can reach the exact solution, as if it were given the combined database. The algorithm is entirely asynchronous, imposes very little communication overhead, transparently tolerates network topology changes and node failures, and quickly adjusts to changes in the data as they occur. Simulations of up to 10 000 nodes show that the algorithm is local: all rules, except for those whose confidence is about equal to the confidence threshold, are discovered using information gathered from a very small vicinity, whose size is independent of the size of the system.
  • Keywords
    data mining; distributed databases; grid computing; knowledge based systems; network topology; association rule mining; data mining problem; federated database system; global synchronization; peer-to-peer system; Association rules; Computer networks; Database systems; Image databases; Peer to peer computing; Transaction databases; Anytime algorithms; local algorithms; peer-to-peer; Algorithms; Artificial Intelligence; Computer Communication Networks; Database Management Systems; Databases, Factual; Information Dissemination; Information Storage and Retrieval; Pattern Recognition, Automated
  • fLanguage
    English
  • Journal_Title
    Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1083-4419
  • Type
    jour
  • DOI
    10.1109/TSMCB.2004.836888
  • Filename
    1356034