DocumentCode :
891352
Title :
Decision tree design from a communication theory standpoint
Author :
Goodman, Rodney M. ; Smyth, Padhraic
Author_Institution :
Dept. of Electr. Eng., California Inst. of Technol., Pasadena, CA, USA
Volume :
34
Issue :
5
fYear :
1988
fDate :
9/1/1988
Firstpage :
979
Lastpage :
994
Abstract :
A communication theory approach to decision tree design based on a top-down mutual information algorithm is presented. It is shown that this algorithm is equivalent to a form of Shannon-Fano prefix coding, and several fundamental bounds relating decision-tree parameters are derived. The bounds are used in conjunction with a rate-distortion interpretation of tree design to explain several phenomena previously observed in practical decision-tree design. A termination rule for the algorithm, called the delta-entropy rule, is proposed that improves its robustness in the presence of noise. Simulation results are presented, showing that the tree classifiers derived by the algorithm compare favourably to the single nearest neighbour classifier.
Keywords :
decision theory; encoding; information theory; trees (mathematics); Shannon-Fano prefix coding; communication theory; decision tree design; delta-entropy rule; rate-distortion interpretation; single nearest neighbour classifier; termination rule; top-down mutual information algorithm; Algorithm design and analysis; Classification tree analysis; Decision trees; Expert systems; Mutual information; Nearest neighbor searches; Noise robustness; Pattern recognition; Rate-distortion; Space technology
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/18.21221
Filename :
21221