Title :
A Non-parametric Semi-supervised Discretization Method
Author :
Bondu, A. ; Boulle, M. ; Lemaire, V. ; Loiseau, S. ; Duval, B.
Author_Institution :
Orange Labs., Lannion
Abstract :
Semi-supervised classification methods aim to exploit both labelled and unlabelled examples to train a predictive model. Most of these approaches make assumptions about the distribution of classes. This article first proposes a new semi-supervised discretization method that adopts a very weakly informative prior on the data. This method discretizes the numerical domain of a continuous input variable while preserving the information relevant to the prediction of classes. Then, an in-depth comparison of this semi-supervised method with the original supervised MODL approach is presented. We demonstrate that the semi-supervised approach is asymptotically equivalent to the supervised approach, improved with a post-optimization of the locations of the interval bounds.
Keywords :
data mining; learning (artificial intelligence); pattern classification; minimal optimized description length; nonparametric semisupervised discretization; predictive model; semisupervised classification; semisupervised learning; supervised MODL; Bayesian methods; Bonding; Classification algorithms; Data mining; Input variables; Iterative algorithms; Maximum likelihood estimation; Optimization methods; Predictive models; Semisupervised learning; Discretization; Non-parametric; Semi-supervised;
Conference_Title :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3502-9
DOI :
10.1109/ICDM.2008.35