DocumentCode :
2418162
Title :
Curated databases
Author :
Buneman, Peter
Author_Institution :
Edinburgh Univ., UK
fYear :
2003
fDate :
10-12 Dec. 2003
Firstpage :
13
Abstract :
Summary form only given. Scientists, notably biologists, are making increasing use of databases to publish both their data and their interpretation of data. These databases are valuable because of the human effort (curation) that goes into their construction and maintenance. They typically consist of a mixture of source data, metadata, annotations, and relevant data that has been extracted from other curated databases. Current database and data exchange technology does not serve database curation well. In this paper, the author addresses a number of issues connected with curated databases. Annotation of existing data now provides a new form of communication between scientists, but conventional database technology provides little support for attaching annotations. The author shows why new models of both data and query languages are needed. Closely related to annotation is provenance. Archiving is also important for verifying the basis of scientific research, yet few published scientific databases do a good job of archiving: past "editions" of the database get lost. The author describes a system that allows frequent archiving and efficient retrieval with remarkably little space overhead. Finally, the author argues that we need a new model of how curated databases are constructed. The idea that such databases are constructed as views of other data through conventional query and update languages is unhelpful; formulating a "copy-and-paste" model of data construction may provide us with better curation tools.
Keywords :
database management systems; information retrieval systems; query languages; query processing; annotations; archiving; copy-and-paste model; curated databases; curation tools; data construction; data exchange technology; data extraction; data interpretation; data publishing; database technology; information retrieval; metadata; query languages; relevant data; scientific databases; source data; Bioinformatics; Data mining; Database languages; Humans; Information systems; Joining processes; Systems engineering and theory;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Information Systems Engineering, 2003. WISE 2003. Proceedings of the Fourth International Conference on
Print_ISBN :
0-7695-1999-7
Type :
conf
DOI :
10.1109/WISE.2003.1254462
Filename :
1254462