Title :
Ensembles of cascading trees
Author :
Li, Jinyan ; Liu, Huiqing
Author_Institution :
Inst. for Infocomm Res., Singapore
Abstract :
We introduce a new method, called CS4, to construct committees of decision trees for classification. The method uses different top-ranked features as the root nodes of the member trees. This idea is particularly suitable for high-dimensional bio-medical data, because the top-ranked features in this type of data usually have similar merit for classification. To make a decision, the committee combines the power of the individual trees in a weighted manner. Unlike Bagging or Boosting, which use bootstrapped training data, our method builds all the member trees of a committee from exactly the same set of training data. We have tested these ideas on UCI data sets as well as on recent bio-medical data sets of gene expression or proteomic profiles, which are usually described by more than 10,000 features. All the experimental results show that our method is efficient and that its classification performance is superior to that of the C4.5 family of algorithms.
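The committee construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: features are ranked by the information gain of their best binary threshold split, each member tree is forced to use one top-ranked feature at its root and is then grown greedily, and each tree votes with a weight equal to the fraction of training samples that reached the matching leaf. The names `build_committee`, `build_tree`, and `classify` are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_split(X, y, feature):
    """Return (gain, threshold) of the best binary split on one feature."""
    values = sorted(set(row[feature] for row in X))
    best = (0.0, None)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [lab for row, lab in zip(X, y) if row[feature] <= t]
        right = [lab for row, lab in zip(X, y) if row[feature] > t]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
        gain = entropy(y) - weighted
        if gain > best[0]:
            best = (gain, t)
    return best

def build_tree(X, y, total, root_feature=None):
    """Grow a tree greedily; root_feature, if given, is forced at this node only."""
    leaf = {"label": Counter(y).most_common(1)[0][0], "coverage": len(y) / total}
    if len(set(y)) == 1:
        return leaf
    candidates = [root_feature] if root_feature is not None else range(len(X[0]))
    gain, t, f = max(((*best_split(X, y, c), c) for c in candidates),
                     key=lambda s: s[0])
    if t is None:  # no split improves purity
        return leaf
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left], total),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], total),
    }

def build_committee(X, y, k):
    """One tree per top-k feature, all trained on the same data (no bootstrap)."""
    ranked = sorted(range(len(X[0])),
                    key=lambda f: best_split(X, y, f)[0], reverse=True)[:k]
    return [build_tree(X, y, len(y), root_feature=f) for f in ranked]

def classify(committee, row):
    """Weighted vote: each tree adds its leaf's training coverage (an assumption)."""
    scores = Counter()
    for node in committee:
        while "feature" in node:
            side = "left" if row[node["feature"]] <= node["threshold"] else "right"
            node = node[side]
        scores[node["label"]] += node["coverage"]
    return scores.most_common(1)[0][0]

# Toy data (made up): features 0 and 1 separate the classes, feature 2 is noise.
X = [[1.0, 5.0, 0.3], [1.2, 5.5, 0.9], [0.9, 4.8, 0.1],
     [3.0, 9.0, 0.8], [3.2, 8.7, 0.2], [2.9, 9.4, 0.5]]
y = ["a", "a", "a", "b", "b", "b"]
committee = build_committee(X, y, k=2)
print(classify(committee, [1.1, 5.2, 0.5]), classify(committee, [3.1, 9.1, 0.5]))
```

Because every member tree sees the same training data, the diversity of the committee in this sketch comes only from the forced root feature, which mirrors the contrast with Bagging and Boosting drawn in the abstract.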
Keywords :
decision trees; learning (artificial intelligence); medical information systems; CS4; UCI data sets; biomedical data; cascading trees; classification; decision tree; gene expression; Bagging; Bioinformatics; Boosting; Classification tree analysis; Decision trees; Gain measurement; Gene expression; Proteomics; Testing; Training data;
Conference_Title :
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN :
0-7695-1978-4
DOI :
10.1109/ICDM.2003.1250983