Title :
Weight of evidence as a tool for attribute transformation in the preprocessing stage of supervised learning algorithms
Author :
Zdravevski, Eftim ; Lameski, Petre ; Kulakov, Andrea
Author_Institution :
NI TEKNA - Intell. Technol., Negotino, Macedonia
Date :
July 31 2011-Aug. 5 2011
Abstract :
Feature transformation is a common task in the data preprocessing stage of data mining and classification problems. Many classification algorithms prefer continuous attributes over nominal ones, and the distance between data points sometimes cannot be estimated unless the attribute values are continuous and normalized. The Weight of Evidence has several desirable properties that make it a very useful tool for attribute transformation, but certain preconditions must be met before it can be calculated. In this paper we propose a modified calculation of the Weight of Evidence that overcomes these preconditions and additionally makes it usable for test examples that were not present in the training set. The proposed transformation can be used for all supervised learning problems. Finally, we present the results of the proposed transformation and discuss the benefits of the transformed nominal and continuous attributes of the PAKDD 2009 dataset. The results show that, for all tested classification algorithms, the proposed transformation yields better performance than generating dummy (i.e., binary) variables for each value of the nominal attributes.
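For orientation, the following is a minimal sketch of a Weight of Evidence encoding for one nominal attribute with a binary target. It uses the textbook definition, ln(P(value | positive) / P(value | negative)), and is not the modified calculation proposed in the paper, which the abstract does not spell out. The additive `smoothing` term and the neutral default of 0.0 for unseen values are assumptions chosen only to illustrate the two limitations the abstract mentions (values lacking examples of one class, and values absent from the training set). The function names `fit_woe` and `transform_woe` are hypothetical.

```python
import math
from collections import Counter


def fit_woe(values, labels, smoothing=0.5):
    """Map each nominal value to a smoothed Weight of Evidence.

    Assumption: additive smoothing keeps the logarithm finite when a
    value has no positive or no negative training examples. This is a
    common workaround, not the paper's modified calculation.
    """
    pos = Counter(v for v, y in zip(values, labels) if y == 1)
    neg = Counter(v for v, y in zip(values, labels) if y == 0)
    categories = set(pos) | set(neg)
    k = len(categories)
    # Smoothed class totals, so the per-class distributions still sum to 1.
    pos_total = sum(pos.values()) + smoothing * k
    neg_total = sum(neg.values()) + smoothing * k
    return {
        v: math.log(
            ((pos[v] + smoothing) / pos_total)
            / ((neg[v] + smoothing) / neg_total)
        )
        for v in categories
    }


def transform_woe(values, woe_table, default=0.0):
    """Replace nominal values by their WoE.

    Assumption: values unseen during fitting get `default` (0.0 means
    "no evidence either way").
    """
    return [woe_table.get(v, default) for v in values]


# Toy data: "b" has no positive examples and "c" has no negative ones,
# so the unsmoothed WoE would be undefined for both.
train_values = ["a", "a", "a", "b", "b", "c"]
train_labels = [1, 1, 0, 0, 0, 1]
table = fit_woe(train_values, train_labels)
encoded = transform_woe(["a", "c", "d"], table)  # "d" is unseen
```

Each nominal attribute becomes a single continuous column under this encoding, whereas dummy variables add one binary column per distinct value.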
Keywords :
data mining; learning (artificial intelligence); pattern classification; problem solving; statistical analysis; attribute transformation; data classification; data mining; data preprocessing stage; problem solving; supervised learning algorithm; training set; weight of evidence; Classification algorithms; Data models; Equations; Estimation; Mathematical model; Training data; Transforms; data preprocessing; data transformation; feature selection; information value; weight of evidence;
Conference_Title :
Neural Networks (IJCNN), The 2011 International Joint Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
978-1-4244-9635-8
DOI :
10.1109/IJCNN.2011.6033219