Title :
Optimal partitioning for classification and regression trees
Author_Institution :
Dept. of Electr. Eng., Stanford Univ., CA, USA
fDate :
4/1/1991
Abstract :
An iterative algorithm that finds a locally optimal partition for an arbitrary loss function, in time linear in N for each iteration, is presented. The algorithm is a K-means-like clustering algorithm that uses as its distance measure a generalization of Kullback's information divergence. Moreover, it is proven that the globally optimal partition must satisfy a nearest neighbour condition using divergence as the distance measure. These results generalize similar results of L. Breiman et al. (1984) to an arbitrary number of classes or regression variables and to an arbitrary number of bins. Experimental results on a text-to-speech example are provided, and additional applications of the algorithm, including the design of variable combinations, surrogate splits, composite nodes, and decision graphs, are suggested.
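The K-means-like iteration described in the abstract can be illustrated with a minimal sketch. It assumes the ordinary Kullback-Leibler divergence between class distributions (roughly the log-loss case), not the generalized divergence for arbitrary loss functions that the paper develops. The function names `kl`, `centroid`, and `partition`, and the input layout (one class distribution and one weight per attribute value), are hypothetical and chosen only for illustration.

```python
import math

def kl(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def centroid(dists, weights, members):
    """Weighted mean of the class distributions of the listed attribute values."""
    total = sum(weights[i] for i in members)
    n_classes = len(dists[0])
    return [sum(weights[i] * dists[i][c] for i in members) / total
            for c in range(n_classes)]

def partition(dists, weights, assign, max_iter=100):
    """Alternate centroid updates and nearest-centroid reassignment.

    dists   -- class distribution for each attribute value (assumed input)
    weights -- probability or count for each attribute value
    assign  -- initial bin index (0..K-1) for each attribute value
    Returns the bin assignment once it stops changing (a local optimum).
    """
    k = max(assign) + 1
    assign = list(assign)
    for _ in range(max_iter):
        # Centroid of each non-empty bin; empty bins are skipped.
        cents = []
        for b in range(k):
            members = [i for i, a in enumerate(assign) if a == b]
            cents.append(centroid(dists, weights, members) if members else None)
        # Nearest-neighbour step: each value moves to the bin whose centroid
        # is closest in divergence. One pass costs O(N * K * classes).
        new_assign = [
            min((b for b in range(k) if cents[b] is not None),
                key=lambda b: kl(dists[i], cents[b]))
            for i in range(len(dists))
        ]
        if new_assign == assign:
            break
        assign = new_assign
    return assign
```

Like K-means, a sketch of this kind depends on the starting assignment: an initial partition in which all bins share the same centroid cannot be separated by the reassignment step, so several initializations may be worth trying.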
Keywords :
decision theory; iterative methods; speech recognition; trees (mathematics); Kullback's information divergence; clustering algorithm; composite nodes; decision graphs; iterative algorithm; partitioning; regression trees; surrogate splits; text-to-speech; Bars; Classification tree analysis; Clustering algorithms; Decision trees; Iterative algorithms; Nearest neighbor searches; Optical character recognition software; Partitioning algorithms; Regression tree analysis; Testing;
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on