• DocumentCode
    756105
  • Title
    Biological data integration: wrapping data and tools
  • Author
    Lacroix, Zoé
  • Author_Institution
    Arizona State Univ., Tempe, AZ, USA
  • Volume
    6
  • Issue
    2
  • fYear
    2002
  • fDate
    6/1/2002 12:00:00 AM
  • Firstpage
    123
  • Lastpage
    128
  • Abstract
    Scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data access, analysis, and visualization tools. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, and data generated by software. We present an approach to wrapping Web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: first, it sends a query to the source to retrieve data; second, it builds the expected output with respect to the virtual structure. To perform these two tasks, our wrappers are composed of, respectively, a retrieval component based on an intermediate object view mechanism called search views, which maps the source capabilities to attributes, and an Extensible Markup Language (XML) engine. The originality of the approach consists of: 1) a generic view mechanism to seamlessly access data sources with limited capabilities and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform object protocol model (OPM) interfaces.
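The two wrapper tasks named in the abstract (retrieval through a search view, then construction of the output by an XML engine) can be illustrated with a minimal sketch. This is not the authors' implementation: the in-memory records, the attribute names, and every function name below are hypothetical, chosen only to show a source with limited capabilities being exposed through attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "source" with limited capabilities: it can only be
# searched by accession or by organism, never by any other field.
RECORDS = [
    {"accession": "X001", "organism": "E. coli", "length": "1200"},
    {"accession": "X002", "organism": "H. sapiens", "length": "3400"},
]


def search_by_accession(value):
    return [r for r in RECORDS if r["accession"] == value]


def search_by_organism(value):
    return [r for r in RECORDS if r["organism"] == value]


# Search view (assumed shape): maps each queryable attribute to the source
# capability that can answer a query on it.
SEARCH_VIEW = {
    "accession": search_by_accession,
    "organism": search_by_organism,
}


def retrieve(attribute, value):
    """Task 1: send the query to the source through the search view."""
    if attribute not in SEARCH_VIEW:
        raise ValueError(f"source cannot be searched by {attribute!r}")
    return SEARCH_VIEW[attribute](value)


def build_xml(records, root_tag="results", item_tag="entry"):
    """Task 2: build the output in the expected (virtual) structure."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for name, text in record.items():
            ET.SubElement(item, name).text = text
    return ET.tostring(root, encoding="unicode")


def wrap(attribute, value):
    """Run both wrapper tasks in sequence."""
    return build_xml(retrieve(attribute, value))
```

Calling `wrap("organism", "E. coli")` returns an XML string with one `entry` element, while `retrieve("length", "1200")` raises `ValueError` because this hypothetical source offers no search capability on `length`.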
  • Keywords
    biology computing; data analysis; digital libraries; distributed databases; hypermedia markup languages; information resources; scientific information systems; Web documents; XML engine; biological data integration; data access; data analysis; data retrieval; data visualization; data wrapping; databases; digital library; flat files; generic view mechanism; heterogeneous data sources; multidatabase system; query; search views; uniform object protocol model interfaces; virtual structure; Access protocols; Data analysis; Data mining; Data visualization; Engines; Information retrieval; Software libraries; Visual databases; Wrapping; XML; Algorithms; Artificial Intelligence; Computational Biology; Computer Communication Networks; Database Management Systems; Databases, Bibliographic; Databases, Factual; Databases, Nucleic Acid; Decision Support Techniques; Feasibility Studies; Information Storage and Retrieval; Internet; MEDLINE; Programming Languages;
  • fLanguage
    English
  • Journal_Title
    Information Technology in Biomedicine, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1089-7771
  • Type
    jour
  • DOI
    10.1109/TITB.2002.1006299
  • Filename
    1006299