• DocumentCode
    3167730
  • Title
    An Efficient Association Rule Mining Algorithm In Distributed Databases
  • Author
    Jian, Wu ; Ming, Li Xing
  • Author_Institution
    Univ. of Electron. Sci. & Technol. of China, Chengdu
  • fYear
    2008
  • fDate
    23-24 Jan. 2008
  • Firstpage
    108
  • Lastpage
    113
  • Abstract
    This paper describes alarm correlation in communication networks based on data mining. Applying sequential algorithms directly to distributed databases is not effective, because it incurs a large communication overhead. In our study, an efficient algorithm, EDMA, is proposed. It minimizes the number of candidate sets and exchanged messages through local and global pruning. At each local site, it runs the application based on the improved algorithm, CMatrix, which is used to calculate local support counts. By numbering the global frequent itemsets generated at the end of the k-th iteration from 1 to m, the algorithm codes every candidate (k+1)-itemset as a pair of those numbers, written (x, y), to compress the content transmitted and to query the corresponding support counts in CMatrix. Our solution also reduces the average transaction size and the dataset size, which reduces scan time. The performance study shows that EDMA has superior running efficiency, lower communication cost, and stronger scalability than direct application of a sequential algorithm in distributed databases.
  • Keywords
    data mining; distributed databases; matrix algebra; message passing; CMatrix algorithm; EDMA algorithm; algorithm codes; association rule mining; communication network; distributed database; global pruning; local pruning; message exchange; Association rules; Communication networks; Costs; Data engineering; Data mining; Distributed databases; Itemsets; Knowledge engineering; Paper technology; Partitioning algorithms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Knowledge Discovery and Data Mining, 2008. WKDD 2008. First International Workshop on
  • Conference_Location
    Adelaide, SA
  • Print_ISBN
    978-0-7695-3090-1
  • Type
    conf
  • DOI
    10.1109/WKDD.2008.33
  • Filename
    4470359
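
The pair coding mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's EDMA or CMatrix implementation; it assumes that (x, y) names the two frequent k-itemsets joined Apriori-style (same first k-1 items) to form a candidate (k+1)-itemset, and that itemsets are numbered 1 to m in sorted order. The function names `encode_candidates` and `decode` are hypothetical.

```python
from itertools import combinations


def number_itemsets(frequent_k):
    """Assumed numbering: sort the itemsets so every site derives the same 1..m order."""
    return sorted(tuple(sorted(s)) for s in frequent_k)


def encode_candidates(frequent_k):
    """Map each pair code (x, y) to the candidate (k+1)-itemset it stands for."""
    numbered = number_itemsets(frequent_k)
    frequent = set(numbered)
    coded = {}
    for (x, a), (y, b) in combinations(enumerate(numbered, start=1), 2):
        # Apriori-style join: only itemsets sharing their first k-1 items combine.
        if a[:-1] != b[:-1]:
            continue
        candidate = a + (b[-1],)
        # Standard Apriori pruning: every k-subset of the candidate must be frequent.
        if all(sub in frequent for sub in combinations(candidate, len(a))):
            coded[(x, y)] = candidate
    return coded


def decode(pair, frequent_k):
    """Rebuild the candidate itemset from its (x, y) code."""
    numbered = number_itemsets(frequent_k)
    x, y = pair
    return numbered[x - 1] + (numbered[y - 1][-1],)


frequent_2 = [{"A", "B"}, {"A", "C"}, {"B", "C"}, {"B", "D"}]
codes = encode_candidates(frequent_2)
print(codes)
```

Under these assumptions a site would send two small integers per candidate instead of k+1 item identifiers, which is one way to read the abstract's claim about compressing transmitted content.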