DocumentCode :
3475520
Title :
BFSM: Finite state machine learned as name boundary definer for bio named entity recognition
Author :
Munkhdalai, Tsendsuren ; Li, Meijing ; Namsrai, E. ; Namsrai, Erdenetuya ; Ryu, Keun Ho
Author_Institution :
Database/Bioinf. Lab., Chungbuk Nat. Univ., Cheongju, South Korea
fYear :
2011
fDate :
27-30 Sept. 2011
Firstpage :
344
Lastpage :
349
Abstract :
An essential task in automated information extraction from biomedical literature is bio named entity recognition, which identifies the boundaries between ordinary words and biomedical technical terms in text and classifies those terms using domain knowledge. Owing to the nature of bio named entities, simply defining their boundaries in text remains challenging. This paper proposes using the part-of-speech tags of tokens as the target observations of a name boundary definer. We propose an approach for modeling a finite state machine as the boundary definer. Aided by machine learning methods, including frequent pattern mining and a Bayesian network, the finite state machine learns from the part-of-speech tags of tokens in bio-text data. The finite state machine based on the Bayesian network is named BFSM. In addition, we report the influence of the part-of-speech tagger on BFSM learning. Experimental results show that a named entity recognition system using the BFSM achieves high accuracy, with an F-score of 85.8.
Keywords :
belief networks; data mining; finite state machines; information retrieval; learning (artificial intelligence); medical computing; text analysis; BFSM; BFSM learning; Bayesian network; bio named entity recognition process; bio-text data; biomedical literature; domain knowledge; finite state machine; frequent pattern mining method; information extraction; machine learning methods; name boundary definer tool; token part-of-speech tagger tool; Biological system modeling; Hidden Markov models; Bayesian network; bio named entity recognition; frequent pattern mining; text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Awareness Science and Technology (iCAST), 2011 3rd International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4577-0887-9
Type :
conf
DOI :
10.1109/ICAwST.2011.6163168
Filename :
6163168