• DocumentCode
    70653
  • Title

    Random Projection Random Discretization Ensembles—Ensembles of Linear Multivariate Decision Trees

  • Author

    Ahmad, Ayaz ; Brown, G.

  • Author_Institution
    Fac. of Comput. & Inf. Technol., King Abdulaziz Univ., Rabigh, Saudi Arabia
  • Volume
    26
  • Issue
    5
  • fYear
    2014
  • fDate
    May 2014
  • Firstpage
    1225
  • Lastpage
    1239
  • Abstract
    In this paper, we present a novel ensemble method, random projection random discretization ensembles (RPRDE), which creates ensembles of linear multivariate decision trees by using a univariate decision tree algorithm. The method combines the lower computational complexity of a univariate decision tree algorithm with the greater representational power of linear multivariate decision trees. We develop a random discretization (RD) method that creates random discretized features from continuous features. Random projection (RP) is used to create new features that are linear combinations of the original features. Each new dataset is created by augmenting the discretized features (created by RD) with the features created by RP. Each decision tree of an RPRDE ensemble is trained on one dataset from the resulting pool of datasets by using a univariate decision tree algorithm. As these multivariate decision trees (multivariate because of the features created by RP) have more representational power than univariate decision trees, we expect accurate decision trees in the ensemble. Diverse training datasets ensure diverse decision trees in the ensemble. We compare the performance of RPRDE with that of other popular ensemble techniques, using the C4.5 tree as the base classifier. RPRDE matches or outperforms the other popular ensemble methods. Experimental results also suggest that the proposed method is quite robust to class noise.
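The dataset-construction step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that RD picks the cut points of each feature from the values of randomly sampled training rows and that RP uses Gaussian random weights. The function names and the parameters `n_bins`, `n_new`, and `n_datasets` are hypothetical. Training the trees is omitted; any univariate learner such as C4.5 would be fit to each dataset in the pool.

```python
import bisect
import random


def random_discretize(rows, n_bins, rng):
    """Discretize every continuous feature with randomly chosen cut points.

    Assumption: the cut points of a feature are the values that feature
    takes at n_bins - 1 randomly sampled training rows.
    """
    n_features = len(rows[0])
    cuts = []
    for j in range(n_features):
        sampled = rng.sample(rows, n_bins - 1)
        cuts.append(sorted(row[j] for row in sampled))
    # bisect_right returns the index of the interval a value falls into,
    # so every discretized value lies in 0 .. n_bins - 1.
    return [[bisect.bisect_right(cuts[j], row[j]) for j in range(n_features)]
            for row in rows]


def random_project(rows, n_new, rng):
    """Create n_new features, each a random linear combination of the originals.

    Assumption: the weights are drawn from a standard Gaussian.
    """
    n_features = len(rows[0])
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
               for _ in range(n_new)]
    return [[sum(w * x for w, x in zip(ws, row)) for ws in weights]
            for row in rows]


def make_rprd_dataset(rows, n_bins, n_new, rng):
    """Augment the RD features with the RP features, row by row."""
    discretized = random_discretize(rows, n_bins, rng)
    projected = random_project(rows, n_new, rng)
    return [d + p for d, p in zip(discretized, projected)]


def make_pool(rows, n_datasets, n_bins=3, n_new=2, seed=0):
    """Build one randomized dataset per ensemble member."""
    rng = random.Random(seed)
    return [make_rprd_dataset(rows, n_bins, n_new, rng)
            for _ in range(n_datasets)]
```

Because a univariate tree can split on a projected feature, a single threshold on that feature corresponds to an oblique (linear multivariate) split in the original feature space, which is the effect the abstract attributes to RP.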
  • Keywords
    computational complexity; decision trees; pattern classification; C4.5 tree; RPRDE; base classifier; linear multivariate decision trees; random discretized features; random projection random discretization ensembles; representational power; univariate decision tree algorithm; Bagging; Educational institutions; Noise; Principal component analysis; Training; Vegetation; Clustering, classification, and association rules; Data mining; Ensembles; discretization; random projections; randomization
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2013.134
  • Filename
    6574846