DocumentCode
1961716
Title
CMP: a fast decision tree classifier using multivariate predictions
Author
Wang, Haixun ; Zaniolo, Carlo
Author_Institution
Dept. of Comput. Sci., California Univ., Los Angeles, CA, USA
fYear
2000
fDate
2000
Firstpage
449
Lastpage
460
Abstract
Most decision tree classifiers are designed to keep class histograms for single attributes, and to select the attribute for the next split using those histograms. We propose a technique where, by keeping histograms on attribute pairs, we achieve a significant speed-up over traditional classifiers based on single-attribute splitting, and the ability to build classifiers that use linear combinations of values from non-categorical attribute pairs as the split criterion. Indeed, by keeping two-dimensional histograms, CMP can often predict the best successive split, in addition to computing the current one; therefore, CMP is normally able to grow more than one level of a decision tree for each data scan. CMP's performance improvements are also due to techniques whereby non-categorical attributes are discretized without loss in classification accuracy; in fact, we introduce simple techniques whereby classification errors caused by discretization at one step can be corrected in the following step. In summary, CMP represents a unified algorithm that extends the functionality of existing classifiers and improves their performance.
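The idea of growing two tree levels from one scan can be illustrated with a small sketch. This is not the CMP algorithm itself; it only shows how a class histogram kept over a pair of already-discretized attributes lets the same counts serve both the current split and a predicted split in a child node. The Gini index as the split measure, the toy data, and all function names (`pair_histogram`, `class_counts`, `best_threshold`) are assumptions made for illustration. Linear-combination splits and the correction of discretization errors are not covered.

```python
from collections import Counter


def gini(counts):
    """Gini impurity of a class-count mapping (assumed split measure)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def pair_histogram(rows, a, b):
    """One data scan: count (bin of a, bin of b, class) triples.

    Each row is (values, label); values are assumed to be discretized bins.
    """
    hist = Counter()
    for values, label in rows:
        hist[(values[a], values[b], label)] += 1
    return hist


def class_counts(hist, keep):
    """Class histogram over the histogram cells (x, y) accepted by keep."""
    out = Counter()
    for (x, y, label), n in hist.items():
        if keep(x, y):
            out[label] += n
    return out


def best_threshold(hist, axis, region=lambda x, y: True):
    """Best 'bin <= t' split on one axis, using only cells inside region.

    Returns (weighted Gini, t), or None if the region has fewer than two bins.
    """
    bins = sorted({k[axis] for k in hist if region(k[0], k[1])})
    best = None
    for t in bins[:-1]:
        left = class_counts(hist, lambda x, y: region(x, y) and (x, y)[axis] <= t)
        right = class_counts(hist, lambda x, y: region(x, y) and (x, y)[axis] > t)
        n_l, n_r = sum(left.values()), sum(right.values())
        score = (n_l * gini(left) + n_r * gini(right)) / (n_l + n_r)
        if best is None or score < best[0]:
            best = (score, t)
    return best


# Toy data: class "A" when x <= 1; otherwise "B" when y <= 1, else "A".
rows = [((x, y), "A" if x <= 1 or y > 1 else "B")
        for x in range(4) for y in range(3)]

hist = pair_histogram(rows, 0, 1)          # the only pass over the data
root = best_threshold(hist, axis=0)        # current split, on attribute 0
# Predicted split for the right child (x > root threshold), on attribute 1,
# computed from the same two-dimensional histogram without rescanning.
child = best_threshold(hist, axis=1, region=lambda x, y: x > root[1])
```

Because the pair histogram records how the second attribute is distributed inside each bin of the first, restricting it to the cells on one side of the root threshold gives the class histogram that the child node would otherwise need a second scan to build.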
Keywords
classification; data mining; database theory; decision trees; software performance evaluation; very large databases; CMP; attribute pairs; class histograms; data mining; fast decision tree classifier; multivariate predictions; performance improvements; single attribute splitting; Classification tree analysis; Data mining; Databases; Decision trees; Ear; Genetics; Histograms; Machine learning; Read only memory; Statistics
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering, 2000. Proceedings. 16th International Conference on
Conference_Location
San Diego, CA
ISSN
1063-6382
Print_ISBN
0-7695-0506-6
Type
conf
DOI
10.1109/ICDE.2000.839444
Filename
839444