DocumentCode :
788407
Title :
Analysis of a complexity-based pruning scheme for classification trees
Author :
Nobel, Andrew B.
Author_Institution :
Dept. of Stat., North Carolina Univ., Chapel Hill, NC, USA
Volume :
48
Issue :
8
fYear :
2002
fDate :
8/1/2002 12:00:00 AM
Firstpage :
2362
Lastpage :
2368
Abstract :
A complexity-based pruning procedure for classification trees is described, and bounds on its finite sample performance are established. The procedure selects a subtree of a (possibly random) initial tree in order to minimize a complexity penalized measure of empirical risk. The complexity assigned to a subtree is proportional to the square root of its size. Two cases are considered. In the first, the growing and pruning data sets are identical, and in the second, they are independent Using the performance bound, the Bayes risk consistency of pruned trees obtained via the procedure is established when the sequence of initial trees satisfies suitable geometric and structural constraints. The pruning method and its analysis are motivated by work on adaptive model selection using complexity regularization.
Keywords :
Bayes methods; adaptive systems; computational complexity; trees (mathematics); Bayes risk consistency; adaptive model selection; classification trees; complexity penalized measure minimization; complexity regularization; complexity-based pruning; finite sample performance bounds; geometric constraints; growing data set; performance bound; pruning data set; random initial tree; structural constraints; subtree; Classification tree analysis; Cost function; Decision trees; Diseases; Histograms; Pattern recognition; Performance evaluation; Random variables; Regression tree analysis; Testing;
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
ieee
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.2002.800482
Filename :
1019844
Link To Document :
بازگشت