DocumentCode
788407
Title
Analysis of a complexity-based pruning scheme for classification trees
Author
Nobel, Andrew B.
Author_Institution
Dept. of Stat., North Carolina Univ., Chapel Hill, NC, USA
Volume
48
Issue
8
fYear
2002
fDate
8/1/2002 12:00:00 AM
Firstpage
2362
Lastpage
2368
Abstract
A complexity-based pruning procedure for classification trees is described, and bounds on its finite sample performance are established. The procedure selects a subtree of a (possibly random) initial tree in order to minimize a complexity penalized measure of empirical risk. The complexity assigned to a subtree is proportional to the square root of its size. Two cases are considered. In the first, the growing and pruning data sets are identical, and in the second, they are independent Using the performance bound, the Bayes risk consistency of pruned trees obtained via the procedure is established when the sequence of initial trees satisfies suitable geometric and structural constraints. The pruning method and its analysis are motivated by work on adaptive model selection using complexity regularization.
Keywords
Bayes methods; adaptive systems; computational complexity; trees (mathematics); Bayes risk consistency; adaptive model selection; classification trees; complexity penalized measure minimization; complexity regularization; complexity-based pruning; finite sample performance bounds; geometric constraints; growing data set; performance bound; pruning data set; random initial tree; structural constraints; subtree; Classification tree analysis; Cost function; Decision trees; Diseases; Histograms; Pattern recognition; Performance evaluation; Random variables; Regression tree analysis; Testing;
fLanguage
English
Journal_Title
Information Theory, IEEE Transactions on
Publisher
ieee
ISSN
0018-9448
Type
jour
DOI
10.1109/TIT.2002.800482
Filename
1019844
Link To Document