DocumentCode :
2851645
Title :
A greedy algorithm for selecting models in ensembles
Author :
Turinsky, Andrei L. ; Grossman, Robert L.
Author_Institution :
University of Calgary, Alberta, Canada
fYear :
2004
fDate :
1-4 Nov. 2004
Firstpage :
547
Lastpage :
550
Abstract :
We are interested in ensembles of models built over k data sets. Common approaches are either to combine models by vote averaging, or to build a meta-model on the outputs of the local models. In this paper, we consider the model assignment approach, in which a meta-model selects one of the local statistical models for scoring. We introduce an algorithm called greedy data labeling (GDL) that improves the initial data partition by reallocating some data, so that when each model is built on its local data subset, the resulting hierarchical system has minimal error. We present evidence that model assignment may in certain situations be more natural than traditional ensemble learning, and if enhanced by GDL, it often outperforms traditional ensembles.
Keywords :
data mining; greedy algorithms; statistical analysis; ensemble learning; greedy algorithm; greedy data labeling; model assignment approach; statistical model; vote averaging; Boosting; Cardiac disease; Data mining; Greedy algorithms; Hierarchical systems; Labeling; Partitioning algorithms; Predictive models; Sampling methods; Voting
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fourth IEEE International Conference on Data Mining (ICDM '04), 2004
Print_ISBN :
0-7695-2142-8
Type :
conf
DOI :
10.1109/ICDM.2004.10009
Filename :
1410357
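
Illustrative note on the abstract: the abstract contrasts vote averaging with model assignment, in which a meta-model selects one local model for scoring. The sketch below is only a toy illustration of that contrast under stated assumptions; it is not the paper's GDL algorithm, and it does not reallocate any data. The mean-predicting local models, the nearest-centroid router, and all function names are hypothetical choices made for this example.

```python
from statistics import mean


def fit_mean_model(points):
    """Fit a trivial local model (assumption): always predict the subset's mean target."""
    target = mean(y for _, y in points)
    return lambda x: target


def vote_average(models, x):
    """Traditional ensemble: average the outputs of all local models."""
    return mean(m(x) for m in models)


def make_router(subsets):
    """Hypothetical meta-model: pick the subset whose input centroid is nearest to x."""
    centroids = [mean(x for x, _ in s) for s in subsets]
    return lambda x: min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))


def model_assignment(models, router, x):
    """Model assignment: the meta-model selects one local model to score x."""
    return models[router(x)](x)


# Toy data with k = 2 subsets of (x, y) pairs: targets near 0 for low x, near 10 for high x.
subsets = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, -1.0)],
    [(8.0, 10.0), (9.0, 9.0), (10.0, 11.0)],
]
models = [fit_mean_model(s) for s in subsets]
router = make_router(subsets)

averaged = vote_average(models, 1.5)
assigned = model_assignment(models, router, 1.5)
print(averaged, assigned)
```

On this toy data the averaged prediction for x = 1.5 falls between the two regimes, while the assigned model returns the mean of its own subset. That is the kind of situation in which the abstract suggests model assignment may be the more natural fit.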