  • DocumentCode
    2261058
  • Title
    Integrating divergent models for gene mention tagging
  • Author
    Li, Lishuang ; Zhou, Rongpeng ; Huang, Degen ; Liao, Wenping
  • Author_Institution
    Dalian Univ. of Technol., Dalian, China
  • fYear
    2009
  • fDate
    24-27 Sept. 2009
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Gene mention tagging is a critical step in biomedical text mining. Only when gene and gene product mentions are correctly identified can more complex tasks, such as gene normalization and gene-gene interaction extraction, be performed effectively. In this paper, six divergent models are implemented with different machine learning algorithms and dissimilar feature sets. We integrate these models to further improve the tagging performance. Experiments conducted on the datasets of the BioCreative II GM task show that our best-performing integration model achieves an F-score of 87.70%, which outperforms most state-of-the-art systems. We also apply CRF++ to see whether Kuo et al.'s integration algorithm, based on likelihood scores and dictionary filtering, is portable to another CRF package.
  • Keywords
    data mining; genetics; learning (artificial intelligence); medical computing; text analysis; BioCreative II GM; biomedical text mining; dictionary filtering; divergent model; gene mention tagging; gene normalization; gene product; gene-gene interaction extraction; integration model; machine learning; tagging performance; Biomedical computing; Hidden Markov models; Learning systems; Machine learning algorithms; Packaging; Performance analysis; Support vector machines; Tagging; Testing; Text mining; Gene Mention Tagging; Named Entity Recognition; Text Mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
  • Conference_Location
    Dalian
  • Print_ISBN
    978-1-4244-4538-7
  • Electronic_ISBN
    978-1-4244-4540-0
  • Type
    conf
  • DOI
    10.1109/NLPKE.2009.5313837
  • Filename
    5313837