DocumentCode :
244700
Title :
Title named entity recognition using Wikipedia and abbreviation generation
Author :
Youngmin Park ; Sangwoo Kang ; Jungyun Seo
Author_Institution :
Dept. of Comput. Sci. & Eng., Sogang Univ., Seoul, South Korea
fYear :
2014
fDate :
15-17 Jan. 2014
Firstpage :
169
Lastpage :
172
Abstract :
We propose a title named entity recognition model that uses Wikipedia and abbreviation generation. The model automatically extracts title named entities from Wikipedia, so it can be renewed continually without additional cost. To build a dictionary of title named entity abbreviations, generation rules produce abbreviation candidates, and the final abbreviations are selected using web search. Recognition is performed by a statistical model based on conditional random fields (CRFs) that uses lexical information, a named entity dictionary, and the abbreviation dictionary. In experiments, the model achieves a title named entity recognition performance of 82.1%.
Keywords :
Web sites; information retrieval; random processes; statistical analysis; text analysis; CRF; Web search methods; Wikipedia; abbreviation generation; automatic title named entity extraction; conditional random fields; generation rules; lexical information; statistical model; title named entity abbreviation dictionary; title named entity recognition; title named entity recognition model; Dictionaries; Educational institutions; Electronic publishing; Encyclopedias; Internet; Syntactics; Abbreviation generation; Conditional random field; Title named entity; Wikipedia;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Big Data and Smart Computing (BIGCOMP), 2014 International Conference on
Conference_Location :
Bangkok
Type :
conf
DOI :
10.1109/BIGCOMP.2014.6741430
Filename :
6741430