DocumentCode :
3125292
Title :
On Generating All Optimal Monotone Classifications
Author :
Stegeman, Luite ; Feelders, Ad
Author_Institution :
Univ. Utrecht, Utrecht, Netherlands
fYear :
2011
fDate :
11-14 Dec. 2011
Firstpage :
685
Lastpage :
694
Abstract :
In many applications of data mining one knows beforehand that the response variable should be monotone (either increasing or decreasing) in the attributes. In ordinal classification, changing the class labels of a data set (relabeling) so that the data becomes monotone is useful for at least two reasons. Firstly, models trained on relabeled data tend to have better predictive performance than models trained on the original data. Secondly, relabeling is an important building block for the construction of monotone classifiers. However, optimal monotone relabelings are rarely unique, and so far an efficient algorithm to generate them all has been lacking. The main result of this paper is an efficient algorithm to produce the structure of all optimal monotone relabelings. We also show that counting the solutions is #P-complete and give algorithms for efficiently enumerating all solutions, as well as sampling uniformly from the set of solutions. Experiments show that relabeling non-monotone data can improve the predictive performance of models trained on that data.
Keywords :
data mining; pattern classification; data set; optimal monotone classifications; optimal monotone relabelings; ordinal classification; Bismuth; Data models; Partitioning algorithms; Prediction algorithms; Predictive models; Vectors; isotonic regression; monotone classification
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1550-4786
Print_ISBN :
978-1-4577-2075-8
Type :
conf
DOI :
10.1109/ICDM.2011.111
Filename :
6137273