  • DocumentCode
    3645309
  • Title
    Mining Dominant Patterns in the Sky
  • Author
    Arnaud Soulet;Chedy Raïssi;Marc Plantevit;Bruno Cremilleux
  • Author_Institution
    LI, Univ. Francois Rabelais de Tours, Tours, France
  • fYear
    2011
  • Firstpage
    655
  • Lastpage
    664
  • Abstract
    Pattern discovery is at the core of numerous data mining tasks. Although many methods focus on efficiency in pattern mining, they still suffer from the problem of choosing a threshold that influences the final extraction result. The goal of our study is to make the results of pattern mining useful from a user-preference point of view. To this end, we integrate the idea of skyline queries into the pattern discovery process in order to mine skyline patterns in a threshold-free manner. Because skyline patterns satisfy a formal dominance property, they are not only of global interest but also have semantics that are easily understood by the user. In this work, we first establish theoretical relationships between pattern condensed representations and skyline pattern mining. We also show that it is possible to automatically compute a subset of the measures involved in the user query that allows the patterns to be condensed and thus facilitates the computation of the skyline patterns. This forms the basis for a novel approach to mining skyline patterns. We illustrate the efficiency of our approach on several data sets, including a use case from chemoinformatics, and show that small sets of dominant patterns are produced under various measures.
  • Keywords
    "Data mining","Frequency measurement","Area measurement","Length measurement","Databases","Redundancy","Semantics"
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2011 IEEE 11th International Conference on
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4577-2075-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2011.100
  • Filename
    6137270