Regional Information Center for Science and Technology - Mining sequence features for DNA-binding site prediction

DocumentCode :

3394669

Title :

Mining sequence features for DNA-binding site prediction

Author :

Hu, Jing ; Yan, Changhui

Author_Institution :

Dept. of Comput. Sci., Utah State Univ., Logan, UT

fYear :

2008

fDate :

15-17 Sept. 2008

Firstpage :

219

Lastpage :

222

Abstract :

Protein-DNA interactions play pivotal roles in gene regulation and in DNA replication and repair. Because the three-dimensional structures of most proteins are still unknown, computational methods that can identify DNA-binding sites from protein sequences are in demand. In this study, we used a greedy method to search for features that are useful for identifying DNA-binding sites. Five features were selected from a pool of 534. Using these five features, a Naive Bayes method achieved a Matthews correlation coefficient (MCC) of 0.31, an improvement over a previous method that used only two features as input. Because all five features can be derived from protein sequences, the proposed method can identify DNA-binding sites using only protein sequences as input.
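
The abstract does not spell out the search procedure, so the following is only a minimal sketch of one common reading of "greedy" feature selection: forward selection that repeatedly adds the single feature giving the largest MCC gain and stops when no candidate improves the score. Everything here is an assumption for illustration and not the authors' implementation: the function names (`mcc`, `naive_bayes_predict`, `greedy_forward_selection`), the Bernoulli Naive Bayes variant over binary features, the stopping rule, the toy data, and scoring on the training set (a real study would use held-out data or cross-validation).

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def naive_bayes_predict(X_train, y_train, X_test, cols):
    """Bernoulli Naive Bayes with Laplace smoothing, restricted to `cols`.
    Assumption: features are binary (0/1); the paper's features may differ."""
    classes = (0, 1)
    n = len(y_train)
    log_prior, log_like = {}, {}
    for c in classes:
        rows = [x for x, y in zip(X_train, y_train) if y == c]
        log_prior[c] = math.log((len(rows) + 1) / (n + 2))
        for j in cols:
            p1 = (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
            # Index 0 holds log P(x_j = 0 | c), index 1 holds log P(x_j = 1 | c).
            log_like[(c, j)] = (math.log(1 - p1), math.log(p1))
    preds = []
    for x in X_test:
        scores = {c: log_prior[c] + sum(log_like[(c, j)][x[j]] for j in cols)
                  for c in classes}
        preds.append(max(classes, key=lambda c: scores[c]))
    return preds

def greedy_forward_selection(X, y, max_features):
    """Add one feature at a time, keeping the one with the highest MCC.
    Stops at `max_features` or when no candidate improves the score."""
    n_features = len(X[0])
    selected, best = [], float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        scored = [(mcc(y, naive_bayes_predict(X, y, X, selected + [f])), f)
                  for f in candidates]
        top_score, top_feature = max(scored, key=lambda s: s[0])
        if top_score <= best:
            break
        selected.append(top_feature)
        best = top_score
    return selected, best

# Toy data (hypothetical): feature 0 tracks the label, feature 1 is noisy,
# feature 2 is constant and therefore uninformative.
X = [[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 1], [1, 0, 1], [0, 1, 1]]
y = [1, 1, 0, 0, 1, 0]
selected, best = greedy_forward_selection(X, y, max_features=2)
print(selected, best)
```

On this toy data the search keeps only the feature that tracks the label, because adding a second feature cannot raise the MCC any further. With a large candidate pool such as the 534 features mentioned in the abstract, each round would evaluate every remaining feature once, so the cost grows with the pool size times the number of features kept.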

Keywords :

DNA; biology computing; data mining; genetics; molecular biophysics; 3-dimensional structure; DNA repair; DNA replication; DNA-binding site prediction; Matthews correlation coefficient; Naive Bayes method; gene regulation; mining sequence; protein-DNA interactions; proteins; Amino acids; DNA computing; Electrostatics; Entropy; Genetics; Neural networks; Proteins; Sequences; Shape; Solvents;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computational Intelligence in Bioinformatics and Computational Biology, 2008. CIBCB '08. IEEE Symposium on

Conference_Location :

Sun Valley, ID

Print_ISBN :

978-1-4244-1778-0

Electronic_ISBN :

978-1-4244-1779-7

Type :

conf

DOI :

10.1109/CIBCB.2008.4675782

Filename :

4675782

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3394669