Title :
Incorporating genetic algorithm into rough feature selection for high dimensional biomedical data
Author :
Dang, Vinh Quoc ; Lam, Chiou-Peng ; Lee, Chang Su
Author_Institution :
Sch. of Comput. & Security Sci., Edith Cowan Univ., Mount Lawley, WA, Australia
Abstract :
In this paper, a hybrid approach that incorporates a genetic algorithm and rough set theory into feature selection is proposed for finding the best subset of features. The approach uses K-means clustering to partition attribute values, a rough set-based method to reduce redundant data, and a genetic algorithm to search for the best feature subset. Applied to the colon cancer dataset, the proposed algorithm selected a set of six attributes as the best subset. Classification was carried out using these six attributes with 23 classifiers from the WEKA (Waikato Environment for Knowledge Analysis) software to examine how well they classify unseen test data. In addition, the six genes found by the proposed approach were examined for their relevance to known biomarkers in the colon cancer domain.
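The three stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`kmeans_1d`, `dependency`, `fitness`, `ga_select`), the size penalty in the fitness, the tournament selection, the one-point crossover, and all parameter values are assumptions chosen for illustration, since the abstract does not specify them.

```python
import random
from collections import defaultdict

def kmeans_1d(values, k=2, iters=20):
    """Discretize one numeric attribute into k cluster labels (1-D K-means)."""
    srt = sorted(values)
    # Deterministic start: spread the initial centroids over the sorted values.
    cents = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - cents[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                cents[c] = sum(members) / len(members)
    return labels

def dependency(rows, decisions, subset):
    """Rough-set dependency degree: share of objects in the positive region."""
    classes = defaultdict(set)
    counts = defaultdict(int)
    for row, d in zip(rows, decisions):
        key = tuple(row[i] for i in subset)  # equivalence class under `subset`
        classes[key].add(d)
        counts[key] += 1
    consistent = sum(n for key, n in counts.items() if len(classes[key]) == 1)
    return consistent / len(rows)

def fitness(rows, decisions, mask, penalty=0.1):
    """Assumed fitness: dependency degree minus a small penalty on subset size."""
    subset = [i for i, bit in enumerate(mask) if bit]
    if not subset:
        return 0.0
    return dependency(rows, decisions, subset) - penalty * len(subset) / len(mask)

def ga_select(rows, decisions, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Search bit-mask chromosomes; returns the indices of the selected attributes."""
    rng = random.Random(seed)
    n = len(rows[0])
    score = lambda m: fitness(rows, decisions, m)
    # The all-ones mask (every attribute kept) is seeded as a baseline.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=score)  # tournament selection
            b = max(rng.sample(pop, 2), key=score)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)
    return [i for i, bit in enumerate(best) if bit]
```

In this sketch each attribute column would first be passed through `kmeans_1d`, and the resulting discretized rows, together with the class labels, would be handed to `ga_select`. The size penalty is one simple way to favor small subsets among those with equal dependency degree.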
Keywords :
cancer; data reduction; feature extraction; genetic algorithms; medical administrative data processing; pattern clustering; rough set theory; K-means clustering; WEKA software; Waikato Environment for Knowledge Analysis; biomarkers; colon cancer dataset; feature selection; genetic algorithm; high dimensional biomedical data; redundant data reduction; rough feature selection; Accuracy; Classification algorithms; Colon; Convergence; Tuning; pattern classification
Conference_Title :
IT in Medicine and Education (ITME), 2011 International Symposium on
Conference_Location :
Guangzhou
Print_ISBN :
978-1-61284-701-6
DOI :
10.1109/ITiME.2011.6132040