• DocumentCode
    3601220
  • Title

    Active Learning-Based Pedagogical Rule Extraction

  • Author

    de Fortuny, Enric Junque ; Martens, David

  • Author_Institution
    INSEAD, Fontainebleau, France
  • Volume
    26
  • Issue
    11
  • fYear
    2015
  • Firstpage
    2664
  • Lastpage
    2677
  • Abstract
    Many of the state-of-the-art data mining techniques introduce nonlinearities in their models to cope with complex data relationships effectively. Although such techniques are consistently included among the top classification techniques in terms of predictive power, their lack of transparency renders them useless in any domain where comprehensibility is of importance. Rule-extraction algorithms remedy this by distilling comprehensible rule sets from complex models that explain how the classifications are made. This paper considers a new rule extraction technique, based on active learning. The technique generates artificial data points around training data with low confidence in the output score, after which these are labeled by the black-box model. The main novelty of the proposed method is that it uses a pedagogical approach without making any architectural assumptions of the underlying model. It can therefore be applied to any black-box technique. Furthermore, it can generate any rule format, depending on the chosen underlying rule induction technique. In a large-scale empirical study, we demonstrate the validity of our technique to extract trees and rules from artificial neural networks, support vector machines, and random forests, on 25 data sets of varying size and dimensionality. Our results show that not only do the generated rules explain the black-box models well (thereby facilitating the acceptance of such models), the proposed algorithm also performs significantly better than traditional rule induction techniques in terms of accuracy as well as fidelity.
  • Keywords
    data mining; learning (artificial intelligence); neural nets; support vector machines; ANN; SVM; active learning; artificial data points; artificial neural networks; black-box technique; comprehensible rule sets; data mining techniques; pedagogical rule extraction; random forests; rule induction technique; support vector machines; training data; Accuracy; Data mining; Data models; Feature extraction; Predictive models; Support vector machines; Training; Active learning; comprehensibility; neural network; random forest (RF); rule extraction; support vector machine (SVM)
  • fLanguage
    English
  • Journal_Title
    Neural Networks and Learning Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2162-237X
  • Type

    jour

  • DOI
    10.1109/TNNLS.2015.2389037
  • Filename
    7018925