DocumentCode :
2549246
Title :
Sequence memoizer based model for Biomedical Named Entity Recognition
Author :
Sun, Yaming ; Sun, Chengjie ; Lin, Lei ; Liu, Ming ; Wang, Xiaolong
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
fYear :
2012
fDate :
29-31 May 2012
Firstpage :
1128
Lastpage :
1132
Abstract :
Biomedical Named Entity Recognition (Bio-NER) plays a fundamental role in biomedical text mining. Improving the performance of the Bio-NER module typically brings a notable improvement to the entire text mining system. Treating the Bio-NER task as a sequence labeling problem, this paper proposes a generative approach to address it. Adopting the spirit of the Sequence Memoizer, our approach assigns reasonable named entity labels to the elements of a given sequence, based on better modeling of the context word distributions. Compared with traditional generative models such as the Hidden Markov Model (HMM), our approach is better at capturing the power-law scaling and long-range dependencies of natural language. Our method is evaluated on the JNLPBA 2004 dataset and compared with classic generative and discriminative approaches. The experimental results show that it outperforms the HMM and is comparable with the Maxent model.
Keywords :
data mining; medical computing; natural language processing; pattern recognition; text analysis; JNLPBA 2004 dataset; biomedical named entity recognition; biomedical text mining; context word distribution modeling; long range dependency; natural language; power-law scaling; sequence labeling problem; sequence memoizer based model; Biological system modeling; Computational modeling; Context; Data models; Hidden Markov models; Training; Training data; Sequence Memoizer; biomedical named entity recognition; generative model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
Conference_Location :
Sichuan
Print_ISBN :
978-1-4673-0025-4
Type :
conf
DOI :
10.1109/FSKD.2012.6234154
Filename :
6234154