DocumentCode :
2378412
Title :
Relational database index choices for genome annotation data
Author :
Karpenko, Oleksiy ; Dai, Yang
Author_Institution :
Dept. of Bioeng., Univ. of Illinois at Chicago, Chicago, IL, USA
fYear :
2010
fDate :
18 Dec. 2010
Firstpage :
264
Lastpage :
268
Abstract :
Latest genomics techniques such as ChIP-chip, ChIP-seq, and RNA-seq have induced an exponential growth in the volume of genome annotations. The effective mining of these data is an indispensable step in systems biology. In this work we consider conventional B-tree, R-tree, and no-index options for annotation data in MySQL, Oracle, and PostgreSQL databases. We validate theoretical considerations of applicability of different indexes by computational experiments with gene, repeat masker, expressed sequence tag, and single nucleotide polymorphism annotations of Homo sapiens chromosome 22. The running times for distance and overlap queries suggest that, with the exception of PostgreSQL B-tree for overlap queries, R-trees are superior to B-trees for indexing annotations, and thus may be a good choice for genome annotation databases.
Keywords :
SQL; bioinformatics; data mining; genomics; query processing; relational databases; B-tree option; ChIP-chip; ChIP-seq; Homo sapiens chromosome 22; MySQL database; Oracle database; PostgreSQL database; R-tree option; RNA-seq; expressed sequence tag annotations; gene annotations; genome annotation data mining; genome annotation databases; genomics techniques; no index option; overlap queries; relational database index choices; repeat masker annotations; single nucleotide polymorphism annotations; systems biology; data mining; genome annotation; spatial index;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedicine Workshops (BIBMW), 2010 IEEE International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-8303-7
Electronic_ISBN :
978-1-4244-8304-4
Type :
conf
DOI :
10.1109/BIBMW.2010.5703810
Filename :
5703810