DocumentCode :
2711158
Title :
Filling in the Blanks - Krimp Minimisation for Missing Data
Author :
Vreeken, Jilles ; Siebes, Arno
Author_Institution :
Dept. of Comput. Sci., Univ. Utrecht, Utrecht
fYear :
2008
fDate :
15-19 Dec. 2008
Firstpage :
1067
Lastpage :
1072
Abstract :
Many data sets are incomplete. For correct analysis of such data, one can either use algorithms that are designed to handle missing data or use imputation. Imputation has the benefit that it allows for any type of data analysis. Obviously, this can only lead to proper conclusions if the provided data completion is both highly accurate and maintains all statistics of the original data. In this paper, we present three data completion methods that are built on the MDL-based KRIMP algorithm. Here, we also follow the MDL principle, i.e., the completed database that can be compressed best is the best completion, because it adheres best to the patterns in the data. By using local patterns, as opposed to a global model, KRIMP captures the structure of the data in detail. Experiments show that, in terms of both accuracy and expected differences of any marginal, these methods provide better data reconstructions than the state of the art, Structural EM.
Keywords :
data analysis; data mining; KRIMP minimisation; data sets; database; imputation; missing data; Algorithm design and analysis; Computer science; DNA; Databases; Filling; Iterative methods; Statistical analysis; Statistics; Krimp; MDL; local patterns; missing data estimation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3502-9
Type :
conf
DOI :
10.1109/ICDM.2008.40
Filename :
4781226