DocumentCode :
2131023
Title :
Genetic Algorithm and Data Mining Techniques for Design Selection in Databases
Author :
Koukouvinos, Christos ; Parpoula, Christina ; Simos, Dimitris E.
Author_Institution :
Dept. of Math., Nat. Tech. Univ. of Athens, Athens, Greece
fYear :
2013
fDate :
2-6 Sept. 2013
Firstpage :
743
Lastpage :
746
Abstract :
Variable selection is fundamental to large dimensional statistical modelling problems, since large databases now exist in diverse fields of science. In this paper, we use data mining tools and experimental designs in databases to select the most relevant variables for classification in regression problems where the observations and labels of a real-world dataset are available. Specifically, the study uses health data to identify the most significant variables, those carrying the information needed for classification and prediction of new data with respect to a certain effect (survival or death). The main goal is to determine the most important variables using methods from the design of experiments combined with algorithmic concepts from data mining and metaheuristics. Our approach seems promising, since we are able to retrieve an optimal plan using only 6 of the 8862 available runs.
Keywords :
data mining; design of experiments; genetic algorithms; health care; medical information systems; pattern classification; regression analysis; support vector machines; very large databases; association rule mining; data classification; data mining techniques; data prediction; design selection; design-of-experiments; genetic algorithm; health data; large databases; large dimensional statistical modelling problems; metaheuristic algorithms; regression problems; variable selection; Algorithm design and analysis; Association rules; Databases; Input variables; feature selection; large dimensional data; metaheuristics; sensitivity analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Availability, Reliability and Security (ARES), 2013 Eighth International Conference on
Conference_Location :
Regensburg
Type :
conf
DOI :
10.1109/ARES.2013.98
Filename :
6657314