• DocumentCode
    2864564
  • Title
    A thorough experimental study of datasets for frequent itemsets
  • Author
    Flouvat, Frédéric ; De March, F. ; Petit, Jean-Marc
  • Author_Institution
    Lab. LIMOS, UMR CNRS 6158, Univ. Clermont-Ferrand II, Aubiere, France
  • fYear
    2005
  • fDate
    27-30 Nov. 2005
  • Abstract
    The discovery of frequent patterns is a famous problem in data mining. While plenty of algorithms have been proposed during the last decade, only a few contributions have tried to understand the influence of datasets on the algorithms' behavior. Explaining why certain algorithms are likely to perform very well or very poorly on some datasets is still an open question. In this setting, we describe a thorough experimental study of datasets with respect to frequent itemsets. We study the distribution of frequent itemsets with respect to itemset size, together with the distribution of three concise representations: frequent closed, frequent free, and frequent essential itemsets. For each of them, we also study the distribution of their positive and negative borders whenever possible. From this analysis, we exhibit a new characterization of datasets and some invariants that allow better prediction of the behavior of well-known algorithms. The main perspective of this work is to devise algorithms that adapt to dataset characteristics.
  • Keywords
    data mining; frequent closed item set; frequent essential item set; frequent free item set; frequent item set; frequent patterns; Adaptive algorithm; Algorithm design and analysis; Association rules; Classification algorithms; Conferences; Data mining; Data structures; Itemsets; Statistical distributions
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, Fifth IEEE International Conference on
  • ISSN
    1550-4786
  • Print_ISBN
    0-7695-2278-5
  • Type
    conf
  • DOI
    10.1109/ICDM.2005.15
  • Filename
    1565675
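
The notions named in the abstract above (frequent, closed, and free itemsets, and the positive and negative borders, each counted by itemset size) can be illustrated with a minimal sketch. This is not the authors' code or experimental setup: the toy transactions, the `MIN_SUPPORT` threshold, and the helper names `mine` and `size_distribution` are assumptions made for illustration, and the enumeration is brute force (exponential in the number of items). Essential itemsets are left out because they rely on disjunctive supports, which this sketch does not compute.

```python
from collections import Counter
from itertools import combinations

# Toy transaction database (hypothetical data, not taken from the paper).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
    {"c", "d"},
]
MIN_SUPPORT = 2  # absolute support threshold (assumed value)


def mine(transactions, min_support):
    """Enumerate every non-empty itemset and classify it (brute force)."""
    items = sorted(set().union(*transactions))
    candidates = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
    ]
    # Support = number of transactions containing the whole itemset.
    sup = {x: sum(1 for t in transactions if x <= t) for x in candidates}
    # The empty itemset is contained in every transaction; it is used only
    # as a subset reference and is not counted in any result below.
    sup[frozenset()] = len(transactions)

    def supersets(x):
        # Immediate supersets: x plus one missing item.
        return [x | {i} for i in items if i not in x]

    def subsets(x):
        # Immediate subsets: x minus one of its items.
        return [x - {i} for i in x]

    frequent = {x for x in candidates if sup[x] >= min_support}
    return {
        "frequent": frequent,
        # Closed: every immediate superset has strictly smaller support.
        "closed": {x for x in frequent
                   if all(sup[y] < sup[x] for y in supersets(x))},
        # Free: every immediate subset has strictly larger support.
        "free": {x for x in frequent
                 if all(sup[y] > sup[x] for y in subsets(x))},
        # Positive border: maximal frequent itemsets.
        "positive_border": {x for x in frequent
                            if all(sup[y] < min_support for y in supersets(x))},
        # Negative border: minimal infrequent itemsets.
        "negative_border": {x for x in candidates
                            if sup[x] < min_support
                            and all(sup[y] >= min_support for y in subsets(x))},
    }


def size_distribution(itemsets):
    """Map itemset size to the number of itemsets of that size."""
    return dict(sorted(Counter(len(x) for x in itemsets).items()))


result = mine(TRANSACTIONS, MIN_SUPPORT)
for name, itemsets in result.items():
    print(name, size_distribution(itemsets))
```

Checking only immediate supersets and subsets is sufficient here because support never increases when an item is added to an itemset. On this toy data the closed and free collections are both smaller than the full set of frequent itemsets, which is the sense in which they act as concise representations.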