Title :
Knowledge discovery in medical and biological datasets using a hybrid Bayes classifier/evolutionary algorithm
Author :
Raymer, Michael L. ; Doom, Travis E. ; Kuhn, Leslie A. ; Punch, William F.
Author_Institution :
Dept. of Comput. Sci. & Eng., Wright State Univ., Dayton, OH, USA
Abstract :
A key element of bioinformatics research is the extraction of meaningful information from large experimental data sets. Various approaches, including statistical and graph-theoretical methods, data mining, and computational pattern recognition, have been applied to this task with varying degrees of success. Using a novel classifier based on the Bayes discriminant function, we present a hybrid algorithm that employs feature selection and extraction to isolate salient features from large medical and other biological data sets. We have previously shown that a genetic algorithm coupled with a k-nearest-neighbors classifier performs well in extracting information about protein-water binding from X-ray crystallographic protein structure data. Here, we demonstrate that the hybrid EC-Bayes classifier can identify the most statistically relevant features of this data set and weight them appropriately to aid in the prediction of solvation sites.
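The general idea of coupling an evolutionary search with a Bayes classifier for feature selection can be illustrated with a minimal sketch. It is not the authors' method: a plain Gaussian naive Bayes model stands in for their classifier based on the Bayes discriminant function, a binary keep/drop mask stands in for their feature selection and weighting, and the data are synthetic rather than crystallographic. All function names and parameters (fit_gaussian_nb, evolve_mask, penalty, p_mut, and so on) are hypothetical and chosen only for this illustration.

```python
import math
import random

def make_toy_data(n, n_features, rng):
    """Synthetic two-class data (an assumption): only feature 0 is informative."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 4.0 * label
        X.append(row)
        y.append(label)
    return X, y

def fit_gaussian_nb(X, y, mask):
    """Estimate log-prior, mean and variance per class for the unmasked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for j in cols:
            vals = [r[j] for r in rows]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals) + 1e-6
            stats.append((j, mu, var))
        model[label] = (math.log(len(rows) / len(X)), stats)
    return model

def predict(model, x):
    """Return the class with the largest Gaussian log-posterior score."""
    def score(label):
        log_prior, stats = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x[j] - mu) ** 2 / (2 * var)
            for j, mu, var in stats)
    return max(model, key=score)

def fitness(mask, train, valid, penalty=0.01):
    """Validation accuracy minus a small cost per selected feature."""
    if not any(mask):
        return 0.0
    model = fit_gaussian_nb(train[0], train[1], mask)
    hits = sum(predict(model, x) == t for x, t in zip(valid[0], valid[1]))
    return hits / len(valid[1]) - penalty * sum(mask)

def evolve_mask(train, valid, n_features, pop_size=20, generations=30,
                p_mut=0.1, seed=0):
    """Simple elitist GA over binary feature masks (requires n_features >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, train, valid), reverse=True)
        parents = ranked[:pop_size // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)    # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, train, valid))
```

Calling evolve_mask(train, valid, n_features) with data from make_toy_data returns a list of booleans marking the features the search retained. The per-feature penalty is one simple way to favor small feature sets; the real-valued feature weighting the abstract mentions would need a different chromosome encoding and a classifier that is sensitive to feature scale.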
Keywords :
Bayes methods; data mining; evolutionary computation; graph theory; medical expert systems; pattern classification; statistical analysis; Bayes classifier; Bayes discriminant function; GA; X-ray crystallographic protein structure data; bioinformatics research; biological datasets; computational pattern recognition; evolutionary algorithm; experimental data sets; feature extraction; feature selection; genetic algorithm; graph theoretical methods; hybrid EC-Bayes classifier; information extraction; k-NN classifier; k-nearest-neighbors classifier; knowledge discovery; medical datasets; protein-water binding; statistical methods; Bioinformatics; Biology computing; Computer science; Data mining; Evolutionary computation; Feature extraction; Genetic algorithms; Pattern recognition; Protein engineering; Solvents
Journal_Title :
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
DOI :
10.1109/TSMCB.2003.816922