DocumentCode :
1797733
Title :
Imputation of missing data supported by Complete p-Partite attribute-based Decision Graphs
Author :
Bertini, J.R. ; do Carmo Nicoletti, Maria ; Liang Zhao
Author_Institution :
Comput. Sci. Dept., Univ. of Sao Paulo, Sao Paulo, Brazil
fYear :
2014
fDate :
6-11 July 2014
Firstpage :
1100
Lastpage :
1106
Abstract :
Missing attribute values are a recurrent problem in data mining and machine learning. Although there are plenty of techniques to handle this problem, most of them are too simplistic to provide a good estimate of absent attribute values. A very active research area focuses on solving the missing attribute value problem via imputation methods, which replace missing data with substituted values. This paper proposes a new imputation method that uses a special graph, the Complete p-Partite Attribute-based Decision Graph (CpP-AbDG), to estimate the missing values in a consistent and plausible way. The graph is built by dividing the range of each attribute that describes the data into sub-intervals; the sub-intervals are treated as the vertices of the graph. Edges are then established between pairs of distinct vertices, provided they do not relate to the same attribute. Finally, the edges and vertices are assigned weights based on the class distributions. The resulting CpP-AbDG has been shown to be a suitable and informative data structure for finding the proper interval in which a missing attribute value should lie, taking into account all the attributes that describe the data. Results comparing the proposed approach with classical ones, in a computational environment that uses classification problems as the evaluation criterion, show the potential of the method.
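The construction outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the discretization scheme, the weight formulas, or the scoring rule, so everything specific here is an assumption made for illustration: equal-width sub-intervals, weights equal to class-conditional relative frequencies, a score that adds the candidate vertex weight to the weights of its edges to the observed attributes, and the hypothetical names `interval_index`, `build_graph`, and `impute_interval`. It should not be read as the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def interval_index(value, lo, hi, bins):
    """Map a value to one of `bins` equal-width sub-intervals of [lo, hi]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * bins)
    return min(max(idx, 0), bins - 1)


def build_graph(rows, labels, bins=3):
    """Build vertex and edge weights from complete rows (no missing values).

    A vertex is a pair (attribute index, sub-interval index). Weights are
    relative frequencies per class label (an assumption of this sketch).
    """
    n_attrs = len(rows[0])
    ranges = [(min(r[a] for r in rows), max(r[a] for r in rows))
              for a in range(n_attrs)]
    vertex_w = defaultdict(float)  # (vertex, label) -> weight
    edge_w = defaultdict(float)    # (vertex_u, vertex_v, label) -> weight
    for row, label in zip(rows, labels):
        verts = [(a, interval_index(row[a], *ranges[a], bins))
                 for a in range(n_attrs)]
        for v in verts:
            vertex_w[(v, label)] += 1.0 / len(rows)
        # Each row contributes one vertex per attribute, so every pair joins
        # vertices of different attributes: the graph stays p-partite.
        for u, v in combinations(verts, 2):
            edge_w[(u, v, label)] += 1.0 / len(rows)
    return ranges, vertex_w, edge_w


def impute_interval(row, label, missing_attr, graph, bins=3):
    """Return (low, high) bounds of the best-scoring sub-interval."""
    ranges, vertex_w, edge_w = graph
    best, best_score = 0, -1.0
    for k in range(bins):
        cand = (missing_attr, k)
        score = vertex_w.get((cand, label), 0.0)
        for a, value in enumerate(row):
            if a == missing_attr or value is None:
                continue
            other = (a, interval_index(value, *ranges[a], bins))
            u, v = sorted((cand, other))  # same key order as build_graph
            score += edge_w.get((u, v, label), 0.0)
        if score > best_score:
            best, best_score = k, score
    lo, hi = ranges[missing_attr]
    width = (hi - lo) / bins
    return lo + best * width, lo + (best + 1) * width


# Toy data: two attributes that rise together, two classes.
rows = [(1.0, 10.0), (1.2, 11.0), (5.0, 50.0), (5.2, 52.0), (9.0, 90.0), (9.1, 88.0)]
labels = ["a", "a", "b", "b", "b", "b"]
graph = build_graph(rows, labels)
low, high = impute_interval((None, 51.0), "b", 0, graph)
```

In this toy run, class "b" occupies two sub-intervals of the first attribute with equal vertex weight, so the edge to the observed second attribute is what selects between them. A point estimate, for example the interval midpoint, could then be taken from the returned bounds; that last step is also an assumption rather than something the abstract states.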
Keywords :
data mining; graph theory; learning (artificial intelligence); CpP-AbDG; complete p-partite attribute-based decision graphs; data structure; machine learning; missing attribute values; Algorithm design and analysis; Data models; Educational institutions; Electronic mail; Machine learning algorithms; Training; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks (IJCNN), 2014 International Joint Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-6627-1
Type :
conf
DOI :
10.1109/IJCNN.2014.6889593
Filename :
6889593