• DocumentCode
    2549246
  • Title
    Sequence memoizer based model for Biomedical Named Entity Recognition
  • Author
    Sun, Yaming ; Sun, Chengjie ; Lin, Lei ; Liu, Ming ; Wang, Xiaolong
  • Author_Institution
    Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
  • fYear
    2012
  • fDate
    29-31 May 2012
  • Firstpage
    1128
  • Lastpage
    1132
  • Abstract
    Biomedical Named Entity Recognition (Bio-NER) plays a fundamental role in biomedical text mining, and improving the Bio-NER module's performance typically brings a notable improvement to the entire text mining system. Treating the Bio-NER task as a sequence labeling problem, this paper proposes a generative approach to address it. Adopting the spirit of the Sequence Memoizer, our approach assigns named entity labels to the elements of a given sequence based on better modeling of the context word distributions. Compared with traditional generative models such as the Hidden Markov Model (HMM), our approach is better at capturing the power-law scaling and long-range dependencies of natural language. Our method is evaluated on the JNLPBA 2004 dataset and compared with classic generative and discriminative approaches. The experimental results show that it outperforms the HMM and is comparable to the maximum entropy (Maxent) model.
  • Keywords
    data mining; medical computing; natural language processing; pattern recognition; text analysis; JNLPBA 2004 dataset; biomedical named entity recognition; biomedical text mining; context word distribution modeling; long range dependency; natural language; power-law scaling; sequence labeling problem; sequence memoizer based model; Biological system modeling; Computational modeling; Context; Data models; Hidden Markov models; Training; Training data; Sequence Memoizer; generative model
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
  • Conference_Location
    Sichuan
  • Print_ISBN
    978-1-4673-0025-4
  • Type
    conf
  • DOI
    10.1109/FSKD.2012.6234154
  • Filename
    6234154