DocumentCode
2370909
Title
Model stability: a key factor in determining whether an algorithm produces an optimal model from a matching distribution
Author
Ting, Kai Ming ; Quek, Regina Jing Ying
Author_Institution
Gippsland Sch. of Comput. & Inf. Technol., Monash Univ., Clayton, Vic., Australia
fYear
2003
fDate
19-22 Nov. 2003
Firstpage
653
Lastpage
656
Abstract
We investigate the factors that lead to suboptimal models when the training and test class distributions (or misclassification costs) are matched. Our result shows that model stability plays a key role in determining whether an algorithm produces an optimal model from a matching distribution (cost). The performance difference between a model trained from the matching distribution (cost) and the optimal model generally increases as the degree of model stability decreases. The practical implication is that one should follow the conventional wisdom of training a classifier with a class distribution (cost) that matches the test class distribution (cost) only if the learning algorithm is known to be stable.
Keywords
Bayes methods; decision trees; learning (artificial intelligence); pattern classification; learning algorithm; matching distribution; model stability; optimal model; test class distribution; training class distribution; Bayesian methods; Cost function; Decision trees; Distributed computing; Information technology; Optimal control; Size control; Stability criteria; Terminology; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN
0-7695-1978-4
Type
conf
DOI
10.1109/ICDM.2003.1251000
Filename
1251000