• DocumentCode
    2370909
  • Title
    Model stability: a key factor in determining whether an algorithm produces an optimal model from a matching distribution
  • Author
    Ting, Kai Ming; Quek, Regina Jing Ying
  • Author_Institution
    Gippsland Sch. of Comput. & Inf. Technol., Monash Univ., Clayton, Vic., Australia
  • fYear
    2003
  • fDate
    19-22 Nov. 2003
  • Firstpage
    653
  • Lastpage
    656
  • Abstract
    We investigate the factors that lead to suboptimal models when the training and test class distributions (or misclassification costs) are matched. Our result shows that model stability plays a key role in determining whether an algorithm produces an optimal model from a matching distribution (cost). The performance difference between a model trained from the matching distribution (cost) and the optimal model generally increases as model stability decreases. The practical implication is that one should follow the conventional wisdom of training a classifier with a class distribution (cost) that matches the test class distribution (cost) only if the learning algorithm is known to be stable.
  • Keywords
    Bayes methods; decision trees; learning (artificial intelligence); pattern classification; learning algorithm; matching distribution; model stability; optimal model; test class distribution; training class distribution; Bayesian methods; Cost function; Decision trees; Distributed computing; Information technology; Optimal control; Size control; Stability criteria; Terminology; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
  • Print_ISBN
    0-7695-1978-4
  • Type
    conf
  • DOI
    10.1109/ICDM.2003.1251000
  • Filename
    1251000