• DocumentCode
    390914
  • Title

    Association analysis with one scan of databases

  • Author

    Huang, Hao ; Wu, Xindong ; Relue, Richard

  • Author_Institution
    Dept. of Math & Comput. Sci., Colorado Sch. of Mines, Golden, CO, USA
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    629
  • Lastpage
    632
  • Abstract
    Mining frequent patterns with an FP-tree avoids costly candidate generation and repeated occurrence-frequency checking against the support threshold. It therefore achieves better performance and efficiency than Apriori-like algorithms. However, the database still needs to be scanned twice to build the FP-tree. This can be very time-consuming when new data are added to an existing database, because two scans may be needed not only for the new data but also for the existing data. This paper presents a new data structure, the P-tree (Pattern Tree), and a new technique that builds the P-tree with only one scan of the database and can derive the corresponding FP-tree for a specified support threshold. Updating a P-tree with new data requires one scan of the new data only; the existing data do not need to be re-scanned.
  • Keywords
    data mining; pattern recognition; tree data structures; very large databases; Apriori-like algorithms; FP-tree; P-tree data structure; Pattern Tree; association analysis; association rule; candidate generation; database scan; frequent pattern mining; large database; occurrence frequency checking; performance; support threshold; Association rules; Computer science; Data structures; Frequency; Itemsets; Iterative algorithms; Transaction databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
  • Print_ISBN
    0-7695-1754-4
  • Type

    conf

  • DOI
    10.1109/ICDM.2002.1184015
  • Filename
    1184015
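
The one-scan idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the class and method names (`Node`, `PatternTree`, `add`, `paths`, `to_fp_tree`) are hypothetical, and the lexicographic insertion order is an assumption chosen so that the tree can be built before item frequencies are known. Transactions are stored in a prefix tree while item counts accumulate in the same pass; a frequency-ordered tree for a given support threshold is then derived from the stored paths without touching the database again.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item, a count, and child nodes keyed by item."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class PatternTree:
    """Illustrative prefix tree over transactions, filled in a single pass."""

    def __init__(self):
        self.root = Node()
        self.item_counts = Counter()

    def add(self, transaction):
        # Assumption: a fixed lexicographic order, independent of frequencies.
        items = sorted(set(transaction))
        self.item_counts.update(items)
        node = self.root
        for item in items:
            node = node.children.setdefault(item, Node(item))
            node.count += 1

    def paths(self):
        """Yield (items, n) for each group of n identical stored transactions."""
        stack = [(self.root, [])]
        while stack:
            node, prefix = stack.pop()
            # Transactions ending at this node are those not continuing below it.
            ending_here = node.count - sum(c.count for c in node.children.values())
            if node is not self.root and ending_here > 0:
                yield prefix, ending_here
            for child in node.children.values():
                stack.append((child, prefix + [child.item]))

    def to_fp_tree(self, min_support):
        """Build a frequency-ordered tree from stored paths (no database scan)."""
        frequent = {i for i, c in self.item_counts.items() if c >= min_support}

        def by_frequency(item):
            # Descending support, ties broken by item name.
            return (-self.item_counts[item], item)

        fp_root = Node()
        for items, n in self.paths():
            node = fp_root
            kept = sorted((i for i in items if i in frequent), key=by_frequency)
            for item in kept:
                node = node.children.setdefault(item, Node(item))
                node.count += n
        return fp_root
```

In this sketch, adding new transactions only calls `add` on the new data; calling `to_fp_tree` again with any threshold reuses the stored paths, which mirrors the update property claimed in the abstract.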