DocumentCode :
3101758
Title :
Chinese Named Entity Recognition Using a Morpheme-Based Chunking Tagger
Author :
Fu, Guohong
Author_Institution :
Sch. of Comput. Sci. & Technol., Heilongjiang Univ., Harbin, China
fYear :
2009
fDate :
7-9 Dec. 2009
Firstpage :
289
Lastpage :
292
Abstract :
Most previous studies formalize Chinese named entity recognition (NER) as a chunking task with either characters or lexicon words as the basic tokens for chunking. However, it is difficult under this formulation to explore lexical information for NER. Furthermore, traditional NER chunking systems usually employ an exhaustive strategy for entity candidate generation, obviously resulting in efficiency loss during entity decoding. In this paper we propose a morpheme-based chunking framework for Chinese NER and implement an efficient three-stage tagger using the pipeline strategy. To tackle the problem of out-of-vocabulary words and to more effectively explore lexical cues for NER as well, we distinguish named entities from common words and choose morphemes as the basic tokens for entity chunking. To reduce the space of entity candidates and improve the efficiency of entity decoding, we employ internal entity formation pattern rules during entity candidate generation. Our experiments on different datasets show that our system can greatly improve NER efficiency without much degradation of performance.
Keywords :
decoding; information retrieval systems; natural language processing; pattern recognition; Chinese named entity recognition; NER chunking systems; entity candidate generation; entity decoding; lexicon words; morpheme-based chunking tagger; Character recognition; Computer science; Data mining; Decoding; Degradation; Filtering; Natural language processing; Natural languages; Pattern recognition; Pipelines; entity pattern rules; morpheme-based chunking; named entity recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Asian Language Processing, 2009. IALP '09. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-0-7695-3904-1
Type :
conf
DOI :
10.1109/IALP.2009.68
Filename :
5380751