• DocumentCode
    3199966
  • Title

    Information extraction from nanotoxicity related publications

  • Author

    Lemin Xiao ; Kaizhi Tang ; Xiong Liu ; Hui Yang ; Zheng Chen ; Ruimin Xu

  • Author_Institution
    Intell. Autom. Inc., Rockville, MD, USA
  • fYear
    2013
  • fDate
    18-21 Dec. 2013
  • Firstpage
    25
  • Lastpage
    30
  • Abstract
    High-quality experimental data are important for developing predictive models of nanomaterial environmental impact (NEI). Because raw data from experimental laboratories and manufacturing workplaces are usually proprietary and small in scale, extracting information from publications is an attractive alternative for collecting data. We developed an information extraction system that extracts useful information from full-text nanotoxicity-related publications. The system consists of five components: transformation of raw data into a machine-readable format, data preprocessing, ontology-based named entity recognition, rule-based numerical attribute extraction from both tables and unstructured text, and relation extraction among entities and attributes. Applied to a dataset of 94 publications, the system achieves acceptable accuracy. Storing the extracted data in a table according to the relations among them yields a dataset that can be used to predict nanomaterial environmental impact. Such a system is unique in the current nanomaterial community and can help nanomaterial scientists and practitioners quickly locate the information they need without spending a lot of time reading articles.
  • Keywords
    data mining; medical computing; nanomedicine; numerical analysis; toxicology; data preprocessing; full-text nanotoxicity; information extraction system; machine readable format; nanomaterial community; nanomaterial environmental impact; nanotoxicity related publications; ontology-based named entity recognition; predictive models; raw data transformation; rule-based numerical attribute extraction; Information retrieval; Nanoparticles; Ontologies; Pattern matching; Shape; XML; Nanoinformatics; information extraction; named entity recognition; nanotoxicity; relation extraction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine (BIBM), 2013 IEEE International Conference on
  • Conference_Location
    Shanghai
  • Type

    conf

  • DOI
    10.1109/BIBM.2013.6732723
  • Filename
    6732723
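
The abstract names rule-based numerical attribute extraction from unstructured text as one of the five components but gives no rules. As a rough illustration only, the sketch below shows what one such rule might look like: a regular expression that pairs a number with a following unit. The unit list, the function name `extract_numeric_attributes`, and the sample sentence are all assumptions made for this example and are not taken from the paper.

```python
import re

# Hypothetical unit list; the record does not say which units the system handles.
UNITS = r"(?:nm|um|mg/L|ug/mL|mV|%)"

# One illustrative rule: a standalone integer or decimal, optional whitespace,
# then a known unit that is not the prefix of a longer token (e.g. "nmol").
PATTERN = re.compile(
    r"(?<![\w.])(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>" + UNITS + r")(?![A-Za-z/])"
)


def extract_numeric_attributes(text):
    """Return (value, unit) pairs found in free text, in order of appearance."""
    return [
        (float(m.group("value")), m.group("unit"))
        for m in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    # Made-up sentence, used only to exercise the rule.
    sample = (
        "TiO2 particles of 21 nm were dosed at 0.5 mg/L; "
        "zeta potential was 30.2 mV."
    )
    for value, unit in extract_numeric_attributes(sample):
        print(value, unit)
```

A real system would presumably combine many such rules with the recognized entities, so that each value can be linked to the nanomaterial or attribute it describes; that relation step is not sketched here.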