DocumentCode
2851645
Title
A greedy algorithm for selecting models in ensembles
Author
Turinsky, Andrei L. ; Grossman, Robert L.
Author_Institution
Calgary Univ., Alta., Canada
fYear
2004
fDate
1-4 Nov. 2004
Firstpage
547
Lastpage
550
Abstract
We are interested in ensembles of models built over k data sets. Common approaches are either to combine models by vote averaging, or to build a meta-model on the outputs of the local models. In this paper, we consider the model assignment approach, in which a meta-model selects one of the local statistical models for scoring. We introduce an algorithm called greedy data labeling (GDL) that improves the initial data partition by reallocating some data, so that when each model is built on its local data subset, the resulting hierarchical system has minimal error. We present evidence that model assignment may in certain situations be more natural than traditional ensemble learning, and if enhanced by GDL, it often outperforms traditional ensembles.
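The contrast the abstract draws between vote averaging and model assignment can be illustrated with a small sketch. This is not the authors' GDL algorithm, whose details the abstract does not give. It is a hypothetical greedy relabeling loop built on assumptions: 1-D least-squares lines as the local models, squared error as the score, and a nearest-centroid rule standing in for the meta-model. The names `fit_line`, `greedy_relabel`, and `assign_model` are invented for illustration.

```python
from statistics import mean

def fit_line(points):
    """Least-squares fit y = a*x + b on a list of (x, y) pairs."""
    mx = mean(x for x, _ in points)
    my = mean(y for _, y in points)
    var = sum((x - mx) ** 2 for x, _ in points)
    a = sum((x - mx) * (y - my) for x, y in points) / var if var else 0.0
    return a, my - a * mx

def predict(model, x):
    a, b = model
    return a * x + b

def fit_all(points, labels, k, previous=None):
    """Fit one local model per label; keep the old model if a subset is empty."""
    models = []
    for j in range(k):
        subset = [p for p, l in zip(points, labels) if l == j]
        if subset:
            models.append(fit_line(subset))
        else:
            models.append(previous[j] if previous else (0.0, 0.0))
    return models

def greedy_relabel(points, labels, k, max_rounds=20):
    """Assumed greedy loop: move each point to the local model that scores
    it best, refit the models, and stop when no label changes."""
    labels = list(labels)
    models = fit_all(points, labels, k)
    for _ in range(max_rounds):
        new_labels = [
            min(range(k), key=lambda j: (predict(models[j], x) - y) ** 2)
            for x, y in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        models = fit_all(points, labels, k, models)
    return labels, models

def vote_average(models, x):
    """Traditional ensemble: average the outputs of all local models."""
    return mean(predict(m, x) for m in models)

def assign_model(points, labels, k, x):
    """Toy meta-model: select the model whose data subset has the nearest mean x."""
    centroids = {}
    for j in range(k):
        xs = [px for (px, _), l in zip(points, labels) if l == j]
        if xs:
            centroids[j] = mean(xs)
    return min(centroids, key=lambda j: abs(centroids[j] - x))

# Synthetic data with two regimes: y = x for x < 5, y = 20 - x otherwise.
points = [(x, x) for x in range(5)] + [(x, 20 - x) for x in range(5, 10)]
initial_labels = [i % 2 for i in range(len(points))]  # a deliberately poor partition

labels, models = greedy_relabel(points, initial_labels, k=2)
chosen = assign_model(points, labels, 2, 2)
print("labels after relabeling:", labels)
print("model assignment at x=2:", predict(models[chosen], 2))
print("vote averaging at x=2:  ", vote_average(models, 2))
```

On this synthetic data each regime ends up with its own local model, so selecting a single model per input tracks the true value, while averaging the two models blends unrelated regimes. Real data would need a learned meta-model and a held-out error estimate in place of these toy choices.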
Keywords
data mining; greedy algorithms; statistical analysis; ensemble learning; greedy algorithm; greedy data labeling; model assignment approach; statistical model; vote averaging; Boosting; Cardiac disease; Data mining; Greedy algorithms; Hierarchical systems; Labeling; Partitioning algorithms; Predictive models; Sampling methods; Voting
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
Print_ISBN
0-7695-2142-8
Type
conf
DOI
10.1109/ICDM.2004.10009
Filename
1410357