  • DocumentCode
    3142031
  • Title
    A framework for semi-automatic identification, disambiguation and storage of protein-related abbreviations in scientific literature
  • Author
    Atzeni, Paolo ; Polticelli, Fabio ; Toti, Daniele
  • Author_Institution
    Dipt. di Inf. e Autom., Univ. Roma Tre, Rome, Italy
  • fYear
    2011
  • fDate
    11-16 April 2011
  • Firstpage
    59
  • Lastpage
    61
  • Abstract
    We propose a framework for identifying, disambiguating and storing protein-related abbreviations as found in the full texts of scientific papers, in order to build and maintain a publicly available abbreviation repository via a semi-automatic process. This process involves information extraction methods and techniques for acronym identification and resolution, based on lexical clues and syntactical, largely domain-independent criteria. A dictionary and an ontology for proteins provide the means for matching and disambiguating the biological entities. User feedback is gathered at the end of the process and the confirmed entries are then stored and made available to the scientific community for further reviewing.
  • Keywords
    biology computing; dictionaries; information retrieval; ontologies (artificial intelligence); proteins; abbreviation repository; acronym identification; acronym resolution; biological entities; dictionary; information extraction methods; lexical clues; ontology; protein-related abbreviation disambiguation; protein-related abbreviation semi-automatic identification; protein-related abbreviation storage; scientific literature; user feedback; Bioinformatics; Communities; Data mining; Dictionaries; Natural language processing; Proteins;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Engineering Workshops (ICDEW), 2011 IEEE 27th International Conference on
  • Conference_Location
    Hannover
  • Print_ISBN
    978-1-4244-9195-7
  • Electronic_ISBN
    978-1-4244-9194-0
  • Type
    conf
  • DOI
    10.1109/ICDEW.2011.5767646
  • Filename
    5767646
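
The abstract mentions acronym identification based on lexical clues. As a rough illustration only, and not the authors' actual method, the sketch below uses one common clue: a short form in parentheses immediately after its long form, accepted when the initials of the preceding words spell the short form. The names `SHORT_FORM` and `find_candidates` are hypothetical, and the dictionary and ontology matching described in the abstract is not covered.

```python
import re

# Hypothetical lexical clue: a short form in parentheses right after its
# long form, e.g. "green fluorescent protein (GFP)".
SHORT_FORM = re.compile(r"\(([A-Za-z][A-Za-z0-9-]{1,9})\)")


def find_candidates(text):
    """Return (short_form, long_form) pairs supported by the parenthesis clue."""
    pairs = []
    for match in SHORT_FORM.finditer(text):
        short = match.group(1)
        # Ignore hyphens when counting the characters to be explained.
        letters = [c.lower() for c in short if c.isalnum()]
        # Take as many preceding words as the short form has characters.
        words = text[:match.start()].split()
        window = words[-len(letters):]
        initials = [w[0].lower() for w in window]
        # Accept only if each word's initial matches the short form in order.
        if len(window) == len(letters) and initials == letters:
            pairs.append((short, " ".join(window)))
    return pairs


if __name__ == "__main__":
    sample = "Cells were tagged with green fluorescent protein (GFP) and imaged."
    print(find_candidates(sample))
```

Candidates found this way would still need disambiguation and user confirmation before storage, as the abstract describes; a strict initial-letter match like this one misses many real abbreviations and is meant only to show the idea.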