Title :
SVM with entropy regularization and particle swarm optimization for identifying children's health and socioeconomic determinants of education attainments using linked datasets
Author :
Zhou, Shang-Ming ; Lyons, Ronan A. ; Bodger, Owen ; Demmler, Joanne C. ; Atkinson, Mark D.
Author_Institution :
UKCRC DECIPHer (Dev. & Evaluation of Complex Interventions for Public Health Improvement) Centre, Swansea Univ., Swansea, UK
Abstract :
Linking large, disparate database systems at the level of the individual person for medical informatics and e-health research is a challenging task. To identify the influential determinants of the effects of children's health and socioeconomic status on educational attainment, this paper links child health data records, including birth records, school key stage attainment records, and deprivation index scores, from multiple sources via the SAIL databank, an e-health infrastructure. Furthermore, a novel scheme for automatically identifying influential attributes in high-dimensional data is presented. The proposed scheme applies entropy regularisation and particle swarm optimisation (PSO) to the construction of an optimal support vector machine (SVM) model. Its novelty is that, during the learning process, the importance of less influential attributes automatically approaches zero, whilst the importance of the most influential attributes approaches one, so that only the most influential attributes appear in the final SVM model. Moreover, model selection, feature identification and dimensionality reduction are performed simultaneously within a single model structure. The experimental results show that the proposed method is efficient at performing dimensionality reduction and identifying the important determinants of the effects of children's health and socioeconomic status on educational attainment.
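The abstract does not give the formulas behind the scheme, so the following is only an illustrative sketch of why an entropy penalty can push attribute weights toward zero or one. It assumes a binary-entropy penalty on per-attribute weights in [0, 1] and an RBF kernel whose squared differences are scaled by those weights; the function names, the `lam` and `gamma` parameters, and the form of the objective are all hypothetical and are not taken from the paper.

```python
import math


def binary_entropy_penalty(weights, eps=1e-12):
    """Sum of binary entropies of attribute weights in [0, 1].

    The penalty is (numerically) zero when every weight is 0 or 1 and is
    largest, log(2) per weight, when every weight is 0.5.
    """
    total = 0.0
    for w in weights:
        w = min(max(w, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= w * math.log(w) + (1.0 - w) * math.log(1.0 - w)
    return total


def weighted_rbf_kernel(x, z, weights, gamma=1.0):
    """RBF kernel with each attribute's squared difference scaled by its weight.

    An attribute whose weight is 0 has no effect on the kernel value.
    """
    dist2 = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z))
    return math.exp(-gamma * dist2)


def regularised_objective(validation_error, weights, lam=0.1):
    """Hypothetical fitness a PSO particle could minimise (assumed form):
    model error plus an entropy penalty on the attribute weights."""
    return validation_error + lam * binary_entropy_penalty(weights)


undecided = [0.5, 0.5, 0.5]  # every attribute half in, half out
sparse = [1.0, 0.0, 0.0]     # one attribute kept, two discarded
```

Under these assumptions, two weight vectors that give the same validation error are ranked by their entropy: `regularised_objective(0.2, sparse)` is smaller than `regularised_objective(0.2, undecided)`, so a search procedure such as PSO would be steered toward weights near zero or one, and attributes with zero weight drop out of the kernel entirely.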
Keywords :
data handling; database management systems; education; entropy; learning (artificial intelligence); medical information systems; particle swarm optimisation; support vector machines; SAIL databank; child health data records; children health identification; e-health research; education attainments; entropy regularization; large disparate database systems linking; learning process; linked datasets; medical informatics; particle swarm optimisation; socioeconomic determinants; support vector machine; Couplings; Databases; Educational institutions; Entropy; Kernel; Pediatrics; Support vector machines;
Conference_Title :
Neural Networks (IJCNN), The 2010 International Joint Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6916-1
DOI :
10.1109/IJCNN.2010.5596973