DocumentCode :
1989751
Title :
Tracing Lineage in Multi-version Scientific Databases
Author :
Zhang, Mingwu ; Kihara, Daisuke ; Prabhakar, Sunil
Author_Institution :
Purdue Univ., West Lafayette
fYear :
2007
fDate :
14-17 Oct. 2007
Firstpage :
440
Lastpage :
447
Abstract :
The critical need for better tracing of lineage in scientific databases is well known. It is clear that performance is not an issue for most domain scientists - rather the functionality is more important. In this paper, we highlight the importance of maintaining multiple versions of data and tracing fine-grained lineage in support of these needs. We study alternatives for managing versions, and propose a model for the example application of protein annotations. We present query rewriting algorithms for SPJ and ASPJ queries that piggy-back lineage computation with query evaluation. Our models are implemented using PostgreSQL and tested using a large, real dataset from Uniprot. We establish the validity of the approach in enabling relevant queries and study the space and time overheads. While these overheads can be high in some cases, the real gain for scientists is the novel functionality that can allow them to ascertain reliability of derived data, and foster data-driven research. To the best of our knowledge, this is the first work that can handle these types of queries for lineage tracing.
Keywords :
SQL; biology computing; molecular biophysics; proteins; ASPJ queries; PostgreSQL; SPJ queries; Uniprot; fine-grained lineage; lineage tracing; multiversion scientific databases; piggy-back lineage computation; protein annotations; query evaluation; query rewriting algorithms; Bioinformatics; Computer science; Database systems; Hidden Markov models; Knowledge management; Maintenance; Proteins; Query processing; Technology management; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4244-1509-0
Type :
conf
DOI :
10.1109/BIBE.2007.4375599
Filename :
4375599