Title :
Dealing with Missing Values in a Probabilistic Decision Tree during Classification
Author :
Hawarah, Lamis ; Simonet, A. ; Simonet, Ana
Author_Institution :
Inst. d'Ingenierie et de l'Information de Sante, TIMC-IMAG, La Tronche
Abstract :
This paper deals with the problem of missing values in decision trees during classification. Our approach is derived from the ordered attribute trees method, proposed by Lobo and Numao in 2000, which builds a decision tree for each attribute and uses these trees to fill in the missing attribute values. Our method takes into account the dependence between attributes by using mutual information. The result of the classification process is a probability distribution instead of a single class. In this paper, we present tests performed on several databases using our approach and Quinlan's method. We also measure the quality of our classification results. Finally, we discuss some perspectives.
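The abstract names mutual information as the measure of dependence between attributes but gives no formula or procedure. As a rough illustration only, the sketch below computes the standard empirical mutual information between two discrete attributes. The function name `mutual_information` and the toy columns `outlook`, `windy`, and `play` are hypothetical and are not taken from the paper; this is not the authors' implementation.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences.

    Assumes both sequences have the same length and contain no missing values.
    """
    n = len(xs)
    count_x = Counter(xs)            # marginal counts of the first attribute
    count_y = Counter(ys)            # marginal counts of the second attribute
    count_xy = Counter(zip(xs, ys))  # joint counts of value pairs
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy data: two attributes and a class column.
outlook = ["sunny", "sunny", "rain", "rain"]
windy = ["yes", "no", "yes", "no"]
play = ["no", "no", "yes", "yes"]

mi_outlook = mutual_information(outlook, play)  # fully dependent on the class
mi_windy = mutual_information(windy, play)      # independent of the class
```

In this toy data `outlook` determines `play` completely, so their mutual information is 1 bit, while `windy` is independent of `play` and scores 0. Scores of this kind could be used to rank attributes by how strongly they depend on one another or on the class.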
Keywords :
decision trees; pattern classification; statistical distributions; Quinlan's method; classification process; missing attribute values; mutual information; ordered attribute trees method; probabilistic decision tree; probability distribution; Classification tree analysis; Data mining; Databases; Decision trees; Frequency; Machine learning algorithms; Mutual information; Performance evaluation; Testing; Training data
Conference_Title :
Data Mining Workshops, 2006. ICDM Workshops 2006. Sixth IEEE International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2702-7
DOI :
10.1109/ICDMW.2006.56