• DocumentCode
    2700484
  • Title

    A schema-based approach to building a bioinformatics database federation

  • Author

    Kemp, Graham J L ; Angelopoulos, Nicos ; Gray, Peter M D

  • Author_Institution
    Dept. of Comput. Sci., Aberdeen Univ., UK
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    13
  • Lastpage
    20
  • Abstract
    Developments in our ability to integrate and analyse the data held in existing heterogeneous data resources can lead to an increase in our understanding of biological function at all levels. However, supporting ad-hoc queries across multiple data resources and correlating the data retrieved from these is still difficult. To address this, we are building a mediator based on the functional data model database, P/FDM, which integrates access to heterogeneous, distributed biological databases, while making use of existing search engines and indexes, without infringing on the autonomy of the underlying databases. Central to our design philosophy is the use of schemas. We have adopted a federated architecture with a five-level schema, arising from the use of the ANSI-SPARC three-level schema to describe both the existing autonomous data resources and the mediator itself. We describe the use of mapping functions and list comprehensions in query splitting, producing execution plans, code generation and result fusion. We give an example of cross-database querying involving data held locally in P/FDM systems and external data in the Sequence Retrieval System (SRS).
  • Keywords
    biology computing; distributed databases; query processing; scientific information systems; search engines; software architecture; ANSI-SPARC schema-based approach; P/FDM database; SRS; Sequence Retrieval System; ad-hoc queries; autonomous data resources; bioinformatics database federation; biological function; code generation; cross-database querying; data analysis; data correlation; data integration; database autonomy; distributed biological databases; execution plans; external data; federated architecture; functional data model; heterogeneous data resources; indexes; list comprehensions; locally-held data; mapping functions; mediator; multiple data resources; query splitting; result fusion; search engines; Bioinformatics; Biological information theory; Buildings; Data analysis; Data models; Distributed databases; Fusion power generation; Indexes; Information retrieval; Search engines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bio-Informatics and Biomedical Engineering, 2000. Proceedings. IEEE International Symposium on
  • Conference_Location
    Arlington, VA
  • Print_ISBN
    0-7695-0862-6
  • Type

    conf

  • DOI
    10.1109/BIBE.2000.889584
  • Filename
    889584