DocumentCode :
856368
Title :
Rule generation for protein secondary structure prediction with support vector machines and decision tree
Author :
He, Jieyue ; Hu, Hae-Jin ; Harrison, Robert ; Tai, Phang C. ; Pan, Yi
Author_Institution :
Dept. of Comput. Sci. & Eng., Nanjing Univ.
Volume :
5
Issue :
1
fYear :
2006
fDate :
3/1/2006 12:00:00 AM
Firstpage :
46
Lastpage :
53
Abstract :
Support vector machines (SVMs) have shown strong generalization ability in a number of application areas, including protein structure prediction. However, their poor comprehensibility hinders the success of SVMs for protein structure prediction. An explanation of how a decision is made is important for the acceptance of machine learning technology, especially in applications such as bioinformatics. A reasonable interpretation is useful not only for guiding "wet experiments"; the extracted rules also help to integrate computational intelligence with symbolic AI systems for advanced deduction. A decision tree, on the other hand, has good comprehensibility. In this paper, a novel approach to rule generation for protein secondary structure prediction is presented that integrates the merits of both the SVM and the decision tree. The approach combines the SVM with a decision tree into a new algorithm called SVM_DT, which proceeds in three steps. First, an SVM is trained. Then, a new training set is generated through careful selection from the output of the SVM. Finally, the obtained training set is used to train a decision tree learning system and to extract the corresponding rule sets. Experiments on protein secondary structure prediction with the RS126 data set show that the comprehensibility of SVM_DT is much better than that of the SVM. Moreover, the generalization ability of SVM_DT is better than that of C4.5 decision trees and similar to that of the SVM. Hence, SVM_DT can be used not only for prediction but also for guiding biological experiments.
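The three steps named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: a linear SVM trained by Pegasos-style subgradient descent stands in for the paper's SVM, "careful selection" is read as keeping only the examples the SVM classifies correctly, a small greedy Gini tree stands in for the decision tree learner, and the data are two-class toy points rather than RS126 sequence windows. All function names are hypothetical.

```python
import random

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Step 1: linear SVM via Pegasos-style subgradient descent (labels +1/-1)."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # hinge loss is active for this example
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def svm_predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def select_examples(model, X, y):
    """Step 2 (assumed rule): keep examples the SVM classifies correctly."""
    return [(x, yi) for x, yi in zip(X, y) if svm_predict(model, x) == yi]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def build_tree(data, depth=0, max_depth=2):
    """Step 3: greedy Gini decision tree over (features, label) pairs."""
    labels = [yi for _, yi in data]
    majority = max(set(labels), key=labels.count)
    if depth == max_depth or len(set(labels)) == 1:
        return majority  # leaf: predict the majority label
    best = None
    for f in range(len(data[0][0])):
        # Skip the largest value so both sides of the split are non-empty.
        for thr in sorted({x[f] for x, _ in data})[:-1]:
            left = [d for d in data if d[0][f] <= thr]
            right = [d for d in data if d[0][f] > thr]
            score = (len(left) * gini([l for _, l in left])
                     + len(right) * gini([l for _, l in right])) / len(data)
            if best is None or score < best[0]:
                best = (score, f, thr, left, right)
    if best is None:  # no feature has two distinct values
        return majority
    _, f, thr, left, right = best
    return (f, thr,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def tree_predict(tree, x):
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if x[f] <= thr else right
    return tree

def extract_rules(tree, conds=()):
    """Turn each root-to-leaf path into an IF ... THEN rule string."""
    if not isinstance(tree, tuple):
        return [f"IF {' AND '.join(conds) or 'TRUE'} THEN class = {tree}"]
    f, thr, left, right = tree
    return (extract_rules(left, conds + (f"x[{f}] <= {thr:.2f}",))
            + extract_rules(right, conds + (f"x[{f}] > {thr:.2f}",)))

# Toy two-class data (hypothetical; not protein data).
rng = random.Random(1)
X = ([[rng.gauss(2.0, 0.5), rng.gauss(0.0, 1.0)] for _ in range(20)]
     + [[rng.gauss(-2.0, 0.5), rng.gauss(0.0, 1.0)] for _ in range(20)])
y = [1] * 20 + [-1] * 20

svm = train_linear_svm(X, y)            # step 1
selected = select_examples(svm, X, y)   # step 2
tree = build_tree(selected)             # step 3
rules = extract_rules(tree)
for rule in rules:
    print(rule)
```

Because the tree is trained only on examples that agree with the SVM, the printed rules describe the SVM's behaviour on those examples in a readable form. That is the sense in which this kind of pipeline trades the SVM's opacity for rule sets; the selection criterion actually used in the paper may differ from the one assumed here.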
Keywords :
biology computing; decision trees; learning (artificial intelligence); molecular biophysics; molecular configurations; proteins; support vector machines; AI; SVM_DT; bioinformatics; comprehensibility; computational intelligence; decision tree; machine learning; protein secondary structure prediction; rule generation; support vector machines; Bioinformatics; Computer science; Decision trees; Helium; Machine learning; Pattern recognition; Proteins; Scholarships; Support vector machine classification; Support vector machines; Decision tree; protein structure; rule extraction; support vector machine (SVM)
fLanguage :
English
Journal_Title :
NanoBioscience, IEEE Transactions on
Publisher :
IEEE
ISSN :
1536-1241
Type :
jour
DOI :
10.1109/TNB.2005.864021
Filename :
1603533