• DocumentCode
    2849407
  • Title

    A Layered Splice Site Prediction Algorithm Based on Feature Selection and Parameter Optimization

  • Author

    Li, Jing ; Peng, Qinke ; Li, Kankan ; Zhang, Shuwei ; Cheng, Yinzhao

  • Author_Institution
    State Key Lab. for Manuf. Syst. Eng., Xi'an Jiaotong Univ., Xi'an, China
  • fYear
    2009
  • fDate
    19-20 Dec. 2009
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Accurate prediction of splice sites in DNA sequences is a challenging problem in bioinformatics. Above all, it is not clear how many and which features are relevant to the splicing process. Feature selection is therefore often used to improve prediction accuracy, and it can also provide useful biological knowledge. In addition, the parameter settings of the classifier have a significant influence on classification performance. Hence we use a method based on the univariate marginal distribution algorithm (UMDA) that selects the features and optimizes the parameters simultaneously. Moreover, most splice sites are strongly conserved and can be correctly predicted using only the conserved signal features around the splice site, while others, whose conservation is inconspicuous, may need more complex features. Therefore, according to the differences in conservation among splice site signal sequences, a layered prediction algorithm based on feature selection and parameter optimization is proposed: the UMDA SVM 2 layer algorithm. Our experimental results show that this two-layer algorithm, which optimizes features and parameters simultaneously, achieves better performance than some current methods.
  • Keywords
    DNA; bioinformatics; optimisation; sequences; support vector machines; DNA sequences; UMDA SVM 2 layer algorithm; UMDA-based method; biological knowledge; feature selection method; layered splice site prediction algorithm; parameter optimization; splice site signal sequences; support vector machine; univariate marginal distribution algorithm; Accuracy; Genomics; Optimization methods; Prediction algorithms; Predictive models; Support vector machine classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Engineering and Computer Science, 2009. ICIECS 2009. International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-4994-1
  • Type

    conf

  • DOI
    10.1109/ICIECS.2009.5365284
  • Filename
    5365284