DocumentCode
2849407
Title
A Layered Splice Site Prediction Algorithm Based on Feature Selection and Parameter Optimization
Author
Li, Jing ; Peng, Qinke ; Li, Kankan ; Zhang, Shuwei ; Cheng, Yinzhao
Author_Institution
State Key Lab. for Manuf. Syst. Eng., Xi'an Jiaotong Univ., Xi'an, China
fYear
2009
fDate
19-20 Dec. 2009
Firstpage
1
Lastpage
5
Abstract
Accurate prediction of splice sites in DNA sequences is a challenging problem in bioinformatics. Above all, it is not clear how many features, and which ones, are relevant to the splicing process. Feature selection is therefore often used to improve prediction accuracy, and it can also provide useful biological knowledge. In addition, the parameter settings of the classifier have a significant influence on classification performance. Hence we use a method based on the univariate marginal distribution algorithm (UMDA) that selects features and optimizes parameters simultaneously. Moreover, most splice sites are strongly conserved and can be correctly predicted using only the conserved signal features around the splice site, while those with weak conservation may need more complex features. Therefore, according to the differences in conservation among splice site signal sequences, we propose a layered prediction algorithm based on feature selection and parameter optimization: the UMDA SVM 2 layer algorithm. Our experimental results show that this two-layer algorithm, which optimizes features and parameters simultaneously, achieves better performance than some current methods.
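The idea of selecting features and tuning classifier parameters in one search can be illustrated with a minimal UMDA sketch. This is not the authors' implementation: the bit-string encoding (a feature mask followed by bits for a single SVM-style parameter), the names `umda`, `decode`, `toy_fitness`, `RELEVANT` and `TARGET_EXP`, the population sizes, and the probability clamp are all assumptions made for illustration. In particular, `toy_fitness` is a stand-in for what would presumably be the cross-validated performance of an SVM trained on the selected features.

```python
import random

N_FEATURES = 8        # hypothetical number of candidate features
PARAM_BITS = 4        # hypothetical bits encoding one classifier parameter
RELEVANT = {0, 3, 5}  # hypothetical "truly relevant" features (toy data only)
TARGET_EXP = 3        # hypothetical best value of log2(C) (toy data only)


def decode(bits):
    """Split a chromosome into a feature mask and a log2(C) exponent."""
    mask = bits[:N_FEATURES]
    value = int("".join(map(str, bits[N_FEATURES:])), 2)  # 0 .. 15
    return mask, value - 5                                # -5 .. 10


def toy_fitness(bits):
    """Stand-in for classifier accuracy: reward relevant features,
    penalise irrelevant ones and a badly chosen parameter."""
    mask, log2_c = decode(bits)
    hits = sum(1 for i in RELEVANT if mask[i])
    extras = sum(mask) - hits
    return hits - 0.5 * extras - 0.1 * abs(log2_c - TARGET_EXP)


def umda(fitness, n_bits, pop_size=60, n_select=20, generations=40, seed=0):
    """Univariate marginal distribution algorithm over bit strings."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # independent marginals P(bit_i = 1)
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population from the product of the current marginals.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        top_fit = fitness(pop[0])
        if top_fit > best_fit:
            best, best_fit = pop[0][:], top_fit
        # Re-estimate each marginal from the best individuals only.
        selected = pop[:n_select]
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            probs[i] = min(0.95, max(0.05, freq))  # clamp to keep diversity
    return best, best_fit


if __name__ == "__main__":
    best, score = umda(toy_fitness, N_FEATURES + PARAM_BITS)
    mask, log2_c = decode(best)
    print("selected features:", [i for i, b in enumerate(mask) if b])
    print("log2(C):", log2_c, "fitness:", score)
```

Because features and the parameter share one chromosome, each candidate is scored as a whole, so a feature subset is never evaluated with a parameter value tuned for a different subset. A layered predictor along the lines of the abstract could run such a search once per layer, for example with only the conserved signal features in the first layer and a richer feature pool in the second; that split is likewise an assumption here.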
Keywords
DNA; bioinformatics; optimisation; sequences; support vector machines; DNA sequences; UMDA SVM 2 layer algorithm; UMDA-based method; bioinformatics; biological knowledge; feature selection method; layered splice site prediction algorithm; parameter optimization; splice site signal sequences; support vector machine; univariate marginal distribution algorithm; Accuracy; Bioinformatics; DNA; Genomics; Optimization methods; Prediction algorithms; Predictive models; Sequences; Support vector machine classification; Support vector machines;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Engineering and Computer Science, 2009. ICIECS 2009. International Conference on
Conference_Location
Wuhan
Print_ISBN
978-1-4244-4994-1
Type
conf
DOI
10.1109/ICIECS.2009.5365284
Filename
5365284