Title :
Combined Rule Extraction and Feature Elimination in Supervised Classification
Author :
Sheng Liu ; Patel, R.Y. ; Daga, P.R. ; Haining Liu ; Gang Fu ; Doerksen, R.J. ; Yixin Chen ; Wilkins, D.E.
Author_Institution :
Dept. of Comput. & Inf. Sci., Univ. of Mississippi, Oxford, MS, USA
Abstract :
Many biology-related research problems combine multiple sources of data to achieve a better understanding of the underlying problem. It is important to select and interpret the most informative features from these sources. An algorithm that simultaneously extracts rules and selects features would therefore make the predictive model easier to interpret. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. Using a small number of decision rules, CRF achieves performance comparable with state-of-the-art prediction algorithms. Some of the decision rules are biologically significant.
Keywords :
biology computing; decision trees; drug delivery systems; feature extraction; knowledge based systems; pattern classification; 1-norm regularized random forests; biology related research problems; combined rule extraction; drug activity prediction; feature elimination; feature selection; microarray data sets; predictive model; supervised classification; Accuracy; Encoding; Prediction algorithms; Radio frequency; Support vector machines; Rule extraction; multi-class classification; random forests; Algorithms; Artificial Intelligence; Computational Biology; Databases, Factual; Humans; Models, Theoretical; Neoplasms; Oligonucleotide Array Sequence Analysis; P-Glycoprotein; Receptors, Cannabinoid; Reproducibility of Results
Journal_Title :
NanoBioscience, IEEE Transactions on
DOI :
10.1109/TNB.2012.2213264