DocumentCode :
553086
Title :
Imputation of missing data using ensemble algorithms
Author :
Xiaoling Lu ; Jiesheng Si ; Lanfeng Pan ; Yanyun Zhao
Author_Institution :
Center for Applied Statistics, Renmin University of China, Beijing, China
Volume :
2
fYear :
2011
fDate :
26-28 July 2011
Firstpage :
1312
Lastpage :
1315
Abstract :
Missing or incomplete data are common in statistical practice. One way to handle them is model-based imputation, performed either once (single imputation) or several times (multiple imputation). A key problem in analyzing an imputed dataset is obtaining valid statistical inference for the estimated parameters, that is, estimating the standard error of the statistic of interest correctly. This paper proposes using recently developed ensemble algorithms as the imputation model. To achieve multiple imputation, we suggest bootstrap sampling the prediction errors several times. The properties of the proposed methods are studied by simulation and compared with those of existing methods. Finally, the methods are applied to a large real dataset, taking the missing-data mechanism into account.
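The idea in the abstract (fit an ensemble model, then resample its prediction errors to create several completed datasets) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the ensemble is plain bagging over simple least-squares lines, the function names (`fit_bagged`, `multiple_impute`, `pool_mean`) are hypothetical, and the pooling step uses Rubin's rules, the standard way to combine multiply imputed estimates, which the abstract does not name.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: fall back to the mean
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def fit_bagged(xs, ys, n_models, rng):
    """Bagging (assumed ensemble): one base learner per bootstrap resample."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict(models, x):
    """Ensemble prediction: average of the base learners."""
    return sum(a + b * x for a, b in models) / len(models)

def multiple_impute(xs, ys, n_imputations=5, n_models=25, seed=0):
    """Return n_imputations completed copies of ys (None marks a missing value)."""
    rng = random.Random(seed)
    obs = [i for i, y in enumerate(ys) if y is not None]
    models = fit_bagged([xs[i] for i in obs], [ys[i] for i in obs], n_models, rng)
    # Prediction errors on the observed rows; these are what gets resampled.
    residuals = [ys[i] - predict(models, xs[i]) for i in obs]
    completed = []
    for _ in range(n_imputations):
        completed.append([
            y if y is not None else predict(models, xs[i]) + rng.choice(residuals)
            for i, y in enumerate(ys)
        ])
    return completed

def pool_mean(completed):
    """Pool the sample mean across imputations with Rubin's rules (assumed)."""
    m, n = len(completed), len(completed[0])
    means = [sum(d) / n for d in completed]
    within = [sum((v - mu) ** 2 for v in d) / (n - 1) / n
              for d, mu in zip(completed, means)]
    q_bar = sum(means) / m
    between = sum((q - q_bar) ** 2 for q in means) / (m - 1)
    total_var = sum(within) / m + (1 + 1 / m) * between
    return q_bar, total_var ** 0.5

# Small demo on synthetic data with two missing responses.
demo_rng = random.Random(1)
demo_x = [float(i) for i in range(30)]
demo_y = [1.0 + 0.5 * x + demo_rng.gauss(0, 1) for x in demo_x]
demo_y[4] = None
demo_y[17] = None
estimate, std_error = pool_mean(multiple_impute(demo_x, demo_y))
print(estimate, std_error)
```

Adding a resampled residual to each prediction, instead of imputing the prediction itself, is what lets the spread between completed datasets feed into the pooled standard error.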
Keywords :
data analysis; learning (artificial intelligence); parameter estimation; sampling methods; bootstrap sampling; ensemble algorithms; incomplete data; missing data imputation; multiple imputation; parameter estimation; statistical reference; statistical situation; supervised learning; Bagging; Boosting; Data models; Educational institutions; Estimation; Prediction algorithms; Predictive models; ensemble algorithm; imputation; missing data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
Type :
conf
DOI :
10.1109/FSKD.2011.6019647
Filename :
6019647