  • DocumentCode
    2851645
  • Title
    A greedy algorithm for selecting models in ensembles
  • Author
    Turinsky, Andrei L. ; Grossman, Robert L.
  • Author_Institution
    Calgary Univ., Alta., Canada
  • fYear
    2004
  • fDate
    1-4 Nov. 2004
  • Firstpage
    547
  • Lastpage
    550
  • Abstract
    We are interested in ensembles of models built over k data sets. Common approaches are either to combine models by vote averaging, or to build a meta-model on the outputs of the local models. In this paper, we consider the model assignment approach, in which a meta-model selects one of the local statistical models for scoring. We introduce an algorithm called greedy data labeling (GDL) that improves the initial data partition by reallocating some data, so that when each model is built on its local data subset, the resulting hierarchical system has minimal error. We present evidence that model assignment may in certain situations be more natural than traditional ensemble learning, and if enhanced by GDL, it often outperforms traditional ensembles.
  • Keywords
    data mining; greedy algorithms; statistical analysis; ensemble learning; greedy algorithm; greedy data labeling; model assignment approach; statistical model; vote averaging; Boosting; Cardiac disease; Data mining; Greedy algorithms; Hierarchical systems; Labeling; Partitioning algorithms; Predictive models; Sampling methods; Voting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
  • Print_ISBN
    0-7695-2142-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2004.10009
  • Filename
    1410357