• DocumentCode
    788407
  • Title

    Analysis of a complexity-based pruning scheme for classification trees

  • Author

    Nobel, Andrew B.

  • Author_Institution
    Dept. of Stat., North Carolina Univ., Chapel Hill, NC, USA
  • Volume
    48
  • Issue
    8
  • fYear
    2002
  • fDate
    8/1/2002 12:00:00 AM
  • Firstpage
    2362
  • Lastpage
    2368
  • Abstract
    A complexity-based pruning procedure for classification trees is described, and bounds on its finite sample performance are established. The procedure selects a subtree of a (possibly random) initial tree in order to minimize a complexity penalized measure of empirical risk. The complexity assigned to a subtree is proportional to the square root of its size. Two cases are considered. In the first, the growing and pruning data sets are identical, and in the second, they are independent Using the performance bound, the Bayes risk consistency of pruned trees obtained via the procedure is established when the sequence of initial trees satisfies suitable geometric and structural constraints. The pruning method and its analysis are motivated by work on adaptive model selection using complexity regularization.
  • Keywords
    Bayes methods; adaptive systems; computational complexity; trees (mathematics); Bayes risk consistency; adaptive model selection; classification trees; complexity penalized measure minimization; complexity regularization; complexity-based pruning; finite sample performance bounds; geometric constraints; growing data set; performance bound; pruning data set; random initial tree; structural constraints; subtree; Classification tree analysis; Cost function; Decision trees; Diseases; Histograms; Pattern recognition; Performance evaluation; Random variables; Regression tree analysis; Testing;
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    0018-9448
  • Type

    jour

  • DOI
    10.1109/TIT.2002.800482
  • Filename
    1019844