• DocumentCode
    2096073
  • Title
    Integrating Biomedical Publications with Existing Metadata
  • Author
    Nikolov, Nikolay; Stoehr, Peter
  • Author_Institution
    Eur. Bioinf. Inst., Poznan
  • fYear
    2008
  • fDate
    17-19 June 2008
  • Firstpage
    653
  • Lastpage
    655
  • Abstract
    Biomedical literature is currently largely disconnected from its metadata. While freely accessible, centralised metadata repositories exist, the publications themselves are split among a large number of repositories. We address this problem by harvesting freely accessible biomedical publications from the Web and integrating them with the corresponding metadata. The system applies title recognition to the harvested publications using a knowledge-based algorithm, then performs a fuzzy match between the extracted title and the metadata records using an edit distance metric. So far we have located more than 300,000 publications on the Web and achieved more than 96% precision and nearly 85% recall on a random sample of 250 documents harvested from the Web.
  • Keywords
    Internet; electronic publishing; fuzzy set theory; knowledge based systems; medical computing; meta data; World Wide Web; biomedical literature; biomedical publication; fuzzy match; knowledge-based algorithm; metadata; Bioinformatics; Biomedical computing; Fuzzy systems; HTML; Indexing; Information retrieval; Text mining; Uniform resource locators; Web services; XML; data integration;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer-Based Medical Systems, 2008. CBMS '08. 21st IEEE International Symposium on
  • Conference_Location
    Jyvaskyla
  • ISSN
    1063-7125
  • Print_ISBN
    978-0-7695-3165-6
  • Type
    conf
  • DOI
    10.1109/CBMS.2008.127
  • Filename
    4562076
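
The abstract describes a fuzzy match between an extracted title and metadata records using an edit distance metric. The record gives no implementation details, so the following is only a minimal sketch of how such a match could look, assuming plain Levenshtein distance normalised by title length. The function names (`normalize`, `edit_distance`, `similarity`, `best_match`), the record identifiers, and the 0.9 threshold are all hypothetical and are not taken from the paper.

```python
import re


def normalize(title: str) -> str:
    """Lowercase and collapse punctuation/whitespace (an assumed preprocessing step)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to a score in [0, 1]; 1.0 means identical after normalisation."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def best_match(extracted_title: str, records: dict, threshold: float = 0.9):
    """Return (record_id, score) for the closest title, or None if below the threshold.

    `records` maps a hypothetical record identifier to its metadata title.
    The threshold value is an assumption chosen for illustration only.
    """
    best_id, best_score = None, 0.0
    for record_id, record_title in records.items():
        score = similarity(extracted_title, record_title)
        if score > best_score:
            best_id, best_score = record_id, score
    if best_id is not None and best_score >= threshold:
        return best_id, best_score
    return None


# Illustrative records; the identifiers are made up for this example.
records = {
    "rec-1": "Integrating Biomedical Publications with Existing Metadata",
    "rec-2": "A Survey of Text Mining Methods",
}
match = best_match("Integrating Biomedical Publcations with existing Metadata", records)
```

In this sketch a higher threshold trades recall for precision: a misspelled or truncated extracted title still matches its record, while an unrelated title falls below the threshold and is rejected.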