Title :
Undiagnosed samples aided rough set feature selection for medical data
Author :
Donghai Guan ; Weiwei Yuan ; Zilong Jin ; Sungyoung Lee
Author_Institution :
Coll. of Autom., Harbin Eng. Univ., Harbin, China
Abstract :
Medical data often consist of a large number of disease markers. For medical data analysis, some disease markers are unhelpful and can even have negative effects. Feature selection is therefore necessary, as it removes these unimportant disease markers. Among the many feature selection methods, rough set based feature selection (RSFS) has been widely used. Unlike other methods, RSFS is completely data-driven: it does not require additional information such as probability distributions. Traditional RSFS methods extract information only from diagnosed samples, so they usually require a large number of diagnosed samples to achieve good feature selection performance. However, in many real medical applications, diagnosed samples are limited, while undiagnosed samples are plentiful. Motivated by semi-supervised learning methodology, in this paper we propose a novel RSFS method that can learn from both diagnosed and undiagnosed samples. This method is called undiagnosed samples aided rough set feature selection (USA-RSFS). Its main benefit is to reduce the requirement for diagnosed samples with the help of undiagnosed ones. Finally, the promising performance of USA-RSFS is validated through a set of experiments on medical datasets.
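For readers unfamiliar with RSFS, the sketch below illustrates the classic supervised baseline that the abstract contrasts against: a greedy, dependency-degree search in the style of QuickReduct, using diagnosed samples only. It is not the USA-RSFS algorithm, whose details the abstract does not give. The function names (`partition`, `dependency`, `quick_reduct`) and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group sample indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Rough set dependency degree: share of samples lying in equivalence
    classes whose members all carry the same diagnosis (the positive region)."""
    if not rows:
        return 0.0
    consistent = sum(
        len(block)
        for block in partition(rows, attrs)
        if len({labels[i] for i in block}) == 1
    )
    return consistent / len(rows)


def quick_reduct(rows, labels):
    """Greedily add the marker that raises the dependency degree most, until
    the selected subset matches the dependency of the full marker set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    selected = []
    while dependency(rows, labels, selected) < target:
        best = max(
            (a for a in all_attrs if a not in selected),
            key=lambda a: dependency(rows, labels, selected + [a]),
        )
        selected.append(best)
    return selected


# Toy diagnosed samples (made up): three binary disease markers per patient.
# Only marker 0 determines the diagnosis; markers 1 and 2 are noise.
rows = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0), (0, 1, 1)]
labels = ["sick", "sick", "healthy", "healthy", "sick", "healthy"]

print(quick_reduct(rows, labels))  # indices of the selected markers
```

The search relies only on the equivalence classes induced by the data, with no distributional assumptions, which is what "completely data-driven" refers to. Because the dependency degree is computed from diagnosed samples alone, its estimate becomes unreliable when such samples are scarce, which is the limitation the paper sets out to address.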
Keywords :
data analysis; learning (artificial intelligence); medical administrative data processing; patient diagnosis; rough set theory; statistical distributions; USA-RSFS; disease markers; medical data analysis; medical datasets; probability distributions; rough set based feature selection; semi-supervised learning methodology; undiagnosed samples aided rough set feature selection; Medical diagnostic imaging; feature selection; rough set; semi-supervised learning; undiagnosed samples;
Conference_Title :
Parallel Distributed and Grid Computing (PDGC), 2012 2nd IEEE International Conference on
Conference_Location :
Solan
Print_ISBN :
978-1-4673-2922-4
DOI :
10.1109/PDGC.2012.6449895