Abstract:
A novel method for identifying β-hairpin motifs in enzymes is proposed. It uses a support vector machine (SVM) with enzyme sequence information and predicted secondary structure information as feature parameters. The method is trained and tested on an enzyme database of 4030 β-hairpins and 1780 non-β-hairpins. On the training dataset with 5-fold cross-validation, the overall accuracy is 91.00% and the Matthews correlation coefficient (MCC) is 0.79; on the testing dataset in an independent test, the overall accuracy is 88.93% and the MCC is 0.76. The method is further applied to predict 1345 β-hairpins that contain ligand binding sites. For this task, the overall accuracy reaches 89.28% in 5-fold cross-validation on the training dataset and 88.79% in the independent test, with MCC values of 0.77 and 0.74, respectively.
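For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how overall accuracy and MCC are conventionally computed from binary confusion-matrix counts. The function names and the example counts are illustrative assumptions and are not taken from the paper; the abstract does not report the underlying confusion matrices.

```python
import math


def overall_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples (hairpin and non-hairpin) classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator vanishes, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only for illustration (not from the paper):
# tp = hairpins predicted as hairpins, tn = non-hairpins predicted as non-hairpins.
example_acc = overall_accuracy(tp=90, tn=80, fp=20, fn=10)
example_mcc = matthews_cc(tp=90, tn=80, fp=20, fn=10)
```

Because the dataset is imbalanced (4030 positives against 1780 negatives), MCC is a useful complement to accuracy: it accounts for all four confusion-matrix cells rather than only the correctly classified fraction.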