DocumentCode :
429474
Title :
A software system for gene sequence database construction
Author :
Liu, Z. ; Borneman, J. ; Jiang, T.
Author_Institution :
Dept. of Comput. Sci., California Univ., Riverside, CA, USA
Volume :
1
fYear :
2004
fDate :
1-5 Sept. 2004
Firstpage :
2797
Lastpage :
2800
Abstract :
We propose a Web-based software system for sequence database construction. An example application of this system is to construct a ribosomal RNA gene (rDNA) sequence database to facilitate the study of microbial communities. A fast and accurate approximate string-matching algorithm is implemented to fetch rDNA sequences sandwiched by two given primers from GenBank. A homology search algorithm based on Basic-Local-Alignment-Search-Tool (BLAST) is then used to extract rDNA sequences that do not contain the primers. This two-step process leads to an rDNA sequence database for a specific taxonomic group. We consider the distance between two given primers, mismatches and degeneracy when performing string matching. In the homology search, a chaining algorithm is combined with BLAST to obtain global alignments based on local alignments. This system can be used in many biological applications.
Keywords :
DNA; Internet; biological techniques; biology computing; database management systems; genetics; macromolecules; microorganisms; molecular biophysics; string matching; Basic Local Alignment Search Tool; GenBank; Web-based software system; biological applications; chaining algorithm; gene sequence database construction; homology search algorithm; microbial community; primers; rDNA sequence extraction; ribosomal RNA gene sequence database; string-matching algorithm; taxonomy; Application software; Cloning; DNA; Databases; Fingerprint recognition; Genomics; Organisms; RNA; Sequences; Software systems; Oligonucleotide fingerprinting; approximate string-matching; homology search
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Engineering in Medicine and Biology Society, 2004. IEMBS '04. 26th Annual International Conference of the IEEE
Conference_Location :
San Francisco, CA
Print_ISBN :
0-7803-8439-3
Type :
conf
DOI :
10.1109/IEMBS.2004.1403799
Filename :
1403799