Title :
Missing categorical data imputation approach based on similarity
Author :
Wu, Sen ; Feng, Xiaodong ; Han, Yushan ; Wang, Qiang
Author_Institution :
Dongling Sch. of Econ. & Manage., Univ. of Sci. & Technol. Beijing, Beijing, China
Abstract :
Imputing missing data is an important data mining task, because it can influence the data mining result. In this paper, Missing Categorical Data Imputation Based on Similarity (MIBOS) is proposed to solve this problem. The algorithm defines a similarity model between objects with incomplete data, constructs the similarity matrix of the objects, and then obtains the nearest undifferentiated object set of each object to impute the missing data iteratively. During imputation, each imputed value is applied immediately, both within the same iteration and in the following iterations. Experiments on three UCI benchmark data sets show that the proposed algorithm improves complete rate, accuracy, and time efficiency.
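The iterative scheme outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the similarity formula, so everything concrete here is an assumption rather than the authors' definition: the function names `similarity` and `impute`, the `None` marker for missing values, the rule that two objects are "undifferentiated" when they agree on every attribute both have observed, and the majority vote among the most similar donors. For brevity the sketch recomputes similarities on demand instead of storing a full similarity matrix.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing categorical value


def similarity(a, b):
    """Assumed similarity, not the paper's definition: the number of
    attributes on which both objects are observed and equal, or 0 if
    they disagree on any attribute that both have observed."""
    matches = 0
    for x, y in zip(a, b):
        if x is MISSING or y is MISSING:
            continue
        if x != y:
            return 0
        matches += 1
    return matches


def impute(rows, max_iters=10):
    """Fill missing values from the most similar compatible objects.
    Each filled value is written back immediately, so it is visible to
    later comparisons in the same pass and in following passes."""
    rows = [list(r) for r in rows]  # work on a copy
    for _ in range(max_iters):
        changed = False
        for i, row in enumerate(rows):
            for j, value in enumerate(row):
                if value is not MISSING:
                    continue
                # Candidate donors: other objects that know attribute j
                # and are compatible with the current object.
                scored = [
                    (similarity(row, other), k)
                    for k, other in enumerate(rows)
                    if k != i and other[j] is not MISSING
                ]
                scored = [(s, k) for s, k in scored if s > 0]
                if not scored:
                    continue  # no donor yet; retry in a later pass
                best = max(s for s, _ in scored)
                votes = Counter(rows[k][j] for s, k in scored if s == best)
                row[j] = votes.most_common(1)[0][0]
                changed = True
        if not changed:
            break
    return rows


data = [
    ["a", "x", "p"],
    ["a", "x", None],
    ["b", "y", "q"],
    ["b", None, "q"],
]
completed = impute(data)
```

A value that has no compatible donor stays missing under these assumptions, which is one way a "complete rate" below 100% could arise.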
Keywords :
data mining; MIBOS; Missing Categorical Data Imputation Based on Similarity; UCI benchmark data sets; missing categorical data imputation approach; missing data; object similarity matrix; similarity model; Accuracy; Algorithm design and analysis; Data models; Heart; Information systems; Single photon emission computed tomography; missing data imputation; rough set; similarity
Conference_Title :
Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4673-1713-9
Electronic_ISBN :
978-1-4673-1712-2
DOI :
10.1109/ICSMC.2012.6378177