DocumentCode
3533312
Title
An integrated approach of sequence and text mining technology for the identification of transcription factor binding sites
Author
Xiong, Yun ; Yang, Qing ; Qiu, Boren ; Zhu, Yangyong
Author_Institution
Sch. of Comput. Sci., Fudan Univ., Shanghai
fYear
2008
fDate
3-5 Nov. 2008
Firstpage
178
Lastpage
184
Abstract
The study of the complex mechanisms that regulate gene expression at the level of transcription is an important and challenging issue in the post-genomic era. A crucial step is to identify transcription factor binding sites (TFBSs). However, the number of known TFBSs is limited, and the accuracy of state-of-the-art identification methods is still far from satisfactory. In this paper, a novel integrated method for mining transcription factor binding sites is presented, which combines sequence data mining with text mining. The method can therefore not only obtain putative TFBSs from sequence data sets but also acquire experimentally verified TFBSs from the literature. To evaluate the performance of our method, several experiments were conducted on real data sets. The results show that our integrated method outperforms each of its component algorithms alone and, furthermore, achieves higher accuracy than existing algorithms.
Keywords
data mining; text analysis; sequence data mining; state-of-the-art identification methods; text mining technology; transcription factor binding sites; Bioinformatics; Biological control systems; Biological processes; Computer science; Data mining; Databases; Gene expression; Sequences; Testing; Text mining; binding site; bioinformatics; data mining; sequence mining; text mining; transcription factor;
fLanguage
English
Publisher
ieee
Conference_Titel
Bioinformatics and Biomedicine Workshops, 2008. BIBMW 2008. IEEE International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
978-1-4244-2890-8
Type
conf
DOI
10.1109/BIBMW.2008.4686233
Filename
4686233