DocumentCode :
2849407
Title :
A Layered Splice Site Prediction Algorithm Based on Feature Selection and Parameter Optimization
Author :
Li, Jing ; Peng, Qinke ; Li, Kankan ; Zhang, Shuwei ; Cheng, Yinzhao
Author_Institution :
State Key Lab. for Manuf. Syst. Eng., Xi'an Jiaotong Univ., Xi'an, China
fYear :
2009
fDate :
19-20 Dec. 2009
Firstpage :
1
Lastpage :
5
Abstract :
Accurate prediction of splice sites in DNA sequences is a challenging problem in bioinformatics. A central difficulty is that it is not clear how many features, and which ones, are relevant to the splicing process. Feature selection is therefore often used to improve prediction accuracy, and it can also yield useful biological knowledge. In addition, the parameter settings of the classifier have a significant influence on classification performance. We therefore use a method based on the univariate marginal distribution algorithm (UMDA) that selects features and optimizes parameters simultaneously. Most splice sites show strongly conserved signals and can be predicted correctly using only the conserved signal features around the splice site, while sites with weakly conserved signals may need more complex features. Based on these differences in conservation among splice site signal sequences, we propose a layered prediction algorithm built on feature selection and parameter optimization: the UMDA SVM 2 layer algorithm. Our experimental results show that this two-layer algorithm, which optimizes features and parameters simultaneously, achieves better performance than some current methods.
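Illustrative note (not part of the original record): the abstract does not give the details of the UMDA-based search, so the following is only a minimal sketch of how a univariate marginal distribution algorithm might search over a feature mask and a classifier parameter at the same time. Everything specific here is an assumption made for illustration: the bit encoding, the names umda, decode and toy_fitness, the constants, and above all the fitness function, which is a toy stand-in for the cross-validated SVM accuracy a real system would presumably use.

```python
import random

def umda(fitness, n_bits, pop_size=60, n_select=20, generations=40, seed=0):
    """Univariate marginal distribution algorithm over binary strings."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits              # one independent marginal per bit
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population from the current marginals.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        top_fit = fitness(pop[0])
        if top_fit > best_fit:
            best, best_fit = pop[0], top_fit
        # Re-estimate each marginal from the selected individuals,
        # clamped so that no bit becomes fixed permanently.
        elite = pop[:n_select]
        probs = [min(0.95, max(0.05, sum(ind[i] for ind in elite) / n_select))
                 for i in range(n_bits)]
    return best, best_fit

# Hypothetical encoding: 12 feature-mask bits followed by 4 bits for log2(C).
N_FEATURES = 12
N_PARAM_BITS = 4
RELEVANT = {0, 3, 7}     # assumed "informative" features (toy data only)
TARGET_LOG2_C = 5        # assumed best parameter value (toy data only)

def decode(bits):
    """Split a bit string into (feature mask, integer log2 of C)."""
    mask = bits[:N_FEATURES]
    log2_c = int("".join(map(str, bits[N_FEATURES:])), 2)
    return mask, log2_c

def toy_fitness(bits):
    """Stand-in for classifier accuracy: reward relevant features,
    penalize extra features and distance from the target parameter."""
    mask, log2_c = decode(bits)
    hits = sum(mask[i] for i in RELEVANT)
    extras = sum(mask) - hits
    return hits - 0.25 * extras - 0.1 * abs(log2_c - TARGET_LOG2_C)

best, score = umda(toy_fitness, N_FEATURES + N_PARAM_BITS)
mask, log2_c = decode(best)
print("selected features:", [i for i, b in enumerate(mask) if b])
print("log2(C):", log2_c, "fitness:", score)
```

In a realistic setting, toy_fitness would be replaced by an evaluation that trains a classifier on the masked features with the decoded parameter and returns a validation score; the two-layer scheme described in the abstract is not modelled here.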
Keywords :
DNA; bioinformatics; optimisation; sequences; support vector machines; DNA sequences; UMDA SVM 2 layer algorithm; UMDA-based method; bioinformatics; biological knowledge; feature selection method; layered splice site prediction algorithm; parameter optimization; splice site signal sequences; support vector machine; univariate marginal distribution algorithm; Accuracy; Bioinformatics; DNA; Genomics; Optimization methods; Prediction algorithms; Predictive models; Sequences; Support vector machine classification; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Engineering and Computer Science, 2009. ICIECS 2009. International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-4994-1
Type :
conf
DOI :
10.1109/ICIECS.2009.5365284
Filename :
5365284