  • DocumentCode
    1948969
  • Title
    Concept Description - A Fresh Look
  • Author
    Sönströd, Cecilia; Johansson, Ulf
  • Author_Institution
    Boras Univ., Boras
  • fYear
    2007
  • fDate
    12-17 Aug. 2007
  • Firstpage
    2415
  • Lastpage
    2420
  • Abstract
    The main purpose of this paper is to examine the data mining task of concept description, for which several rather different definitions exist. We argue for the definition used by CRISP-DM, where the overall goal is expressed as "gaining insights". Based on this, we propose that the two most important criteria for concept description models are accuracy and comprehensibility. The demand for comprehensibility rules out a straightforward use of many high-accuracy predictive modeling techniques, e.g., neural networks. Instead, we introduce rule extraction from predictive models as an alternative technique for concept description. In the experiments, we show, using ten publicly available data sets, that the rule extractor used is clearly able to produce accurate and comprehensible descriptions. In addition, we discuss how concept description performance could be measured to capture both accuracy and comprehensibility. Comprehensibility is often translated into size, i.e., a smaller model is deemed more comprehensible. In practice, however, it would probably make more sense to treat comprehensibility as a binary property: the description is either comprehensible or not. Regarding accuracy, we argue that accuracies obtained on unseen data provide better information than accuracy on the entire data set. The reason is not that the model should be used for prediction, but that concepts found in this way are more likely to be general, and thus more informative.
  • Keywords
    data mining; concept comprehensibility; data mining task concept description; rule extraction; Advertising; Artificial neural networks; Data mining; Measurement standards; Neural networks; Predictive models; Support vector machines; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2007. IJCNN 2007. International Joint Conference on
  • Conference_Location
    Orlando, FL
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1379-9
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2007.4371336
  • Filename
    4371336