DocumentCode :
1782498
Title :
iDMI: A novel technique for missing value imputation using a decision tree and expectation-maximization algorithm
Author :
Rahman, Md Geaur ; Islam, Md Zahurul
Author_Institution :
Center for Res. in Complex Syst. (CRiCS), Charles Sturt Univ., Bathurst, NSW, Australia
fYear :
2014
fDate :
8-10 March 2014
Firstpage :
496
Lastpage :
501
Abstract :
In this paper we present a novel technique called iDMI that imputes the missing values of a data set by combining a decision tree (DT) algorithm and an expectation-maximization (EMI) algorithm. We first divide a data set into horizontal segments by applying a DT algorithm such as C4.5, and then apply an EMI algorithm to each segment in order to impute the missing values belonging to that segment. If all numerical attribute values of a record are missing, we impute them with the mean values of the attributes over the records belonging to the segment in which the record falls. This reduces the computational time complexity of iDMI compared to an existing technique called DMI, which calculates the mean value of an attribute using all records of the data set. We evaluate the performance of iDMI against three high-quality existing techniques on two real data sets in terms of four evaluation criteria. Our initial experimental results, including several statistical significance analyses, indicate the superiority of iDMI over the existing techniques.
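As an illustration of the fallback step described in the abstract (records whose numerical values are all missing receive the attribute means of their own segment), the following is a minimal sketch and not the authors' implementation. It assumes the segments have already been produced by a decision tree; the function name, the `None` encoding of missing values, and the leaf labels `leaf_a` and `leaf_b` are hypothetical. Records that are only partially missing are left untouched here, since the abstract assigns them to the EMI step.

```python
from collections import defaultdict


def impute_all_missing_by_segment_mean(records, segments):
    """Fill records whose values are ALL missing with per-segment attribute means.

    records  -- list of rows, each a list of floats or None (None = missing)
    segments -- list of segment labels, one per row (assumed to come from a
                decision tree's leaves; computing them is out of scope here)
    """
    # Accumulate per-segment, per-attribute sums and counts of observed values.
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for row, seg in zip(records, segments):
        for j, value in enumerate(row):
            if value is not None:
                sums[seg][j] += value
                counts[seg][j] += 1

    imputed = []
    for row, seg in zip(records, segments):
        if all(value is None for value in row):
            # Use the mean over the record's own segment only, not the whole
            # data set. An attribute with no observed value in the segment
            # stays missing.
            imputed.append([
                sums[seg][j] / counts[seg][j] if counts[seg][j] else None
                for j in range(len(row))
            ])
        else:
            # Partially missing or complete rows are copied unchanged.
            imputed.append(list(row))
    return imputed


# Hypothetical toy data: two segments, two numerical attributes.
records = [
    [1.0, 10.0],
    [3.0, 30.0],
    [None, None],
    [100.0, 200.0],
    [None, None],
]
segments = ["leaf_a", "leaf_a", "leaf_a", "leaf_b", "leaf_b"]
result = impute_all_missing_by_segment_mean(records, segments)
```

Because the means are computed within a segment, each fully missing record only needs statistics from the records that share its leaf, which is the source of the cost reduction over a whole-data-set mean that the abstract mentions.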
Keywords :
data mining; decision trees; expectation-maximisation algorithm; C4.5 algorithm; DT algorithm; EM algorithm; decision tree; expectation-maximization algorithm; iDMI technique; missing value imputation; statistical significance analysis; Accuracy; Computers; Correlation; Electromagnetic interference; Information technology; Remuneration; Data pre-processing; data cleansing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Technology (ICCIT), 2013 16th International Conference on
Conference_Location :
Khulna
Type :
conf
DOI :
10.1109/ICCITechn.2014.6997351
Filename :
6997351