DocumentCode :
3142368
Title :
Classifying Wikipedia entities into fine-grained classes
Author :
Tkatchenko, Maksim ; Ulanov, Alexander ; Simanovsky, Andrey
Author_Institution :
HP Labs Russia, St. Petersburg, Russia
fYear :
2011
fDate :
11-16 April 2011
Firstpage :
212
Lastpage :
217
Abstract :
Recognition of named entities (people, companies, locations, etc.) is an essential task in text analytics. We address a subproblem of this task, namely named entity classification. We propose a novel approach that constructs an effective fine-grained named entity classifier. Its key highlights are semi-automatic training set construction from Wikipedia articles and additional feature selection. We justify our solution by creating an 18-class classifier and demonstrating its effectiveness and efficiency.
Keywords :
Internet; encyclopaedias; pattern classification; text analysis; Wikipedia articles; Wikipedia entities classification; feature selection; fine-grained named entity classifier; named entities recognition; semi-automatic training set construction; text analytics; Accuracy; Electronic publishing; Encyclopedias; Support vector machines; Training
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering Workshops (ICDEW), 2011 IEEE 27th International Conference on
Conference_Location :
Hannover
Print_ISBN :
978-1-4244-9195-7
Electronic_ISBN :
978-1-4244-9194-0
Type :
conf
DOI :
10.1109/ICDEW.2011.5767662
Filename :
5767662