Title :
A Semi-supervised Algorithm for Indonesian Named Entity Recognition
Author :
Rezka Aufar Leonandya;Bayu Distiawan;Nursidik Heru Praptono
Author_Institution :
Fac. of Comput. Sci., Univ. Indonesia, Depok, Indonesia
Abstract :
Named Entity Recognition (NER) is a subfield of Information Extraction with applications in machine translation, question answering, the semantic web, and other areas. One of the biggest challenges in NER is the difficulty of constructing manually labeled training data. In this work, we present a semi-supervised approach to Indonesian NER that creates high-quality training data automatically. The approach uses unlabeled data drawn from Wikipedia and DBPedia to form high-accuracy, non-redundant additional training data at each iteration of the semi-supervised process. We show that our system generates new training data and achieves an increasing F1 score as the iterations proceed.
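The abstract does not specify the classifier, the features, or the acceptance rule, so the following is only a minimal sketch of a generic self-training loop of the kind described: train on the labeled set, label the unlabeled pool, keep only confident and non-duplicate predictions, and repeat. Everything in it is an assumption made for illustration, not the authors' method: the count-based voting model, the `(context word, mention)` example format, the `threshold` and `max_rounds` parameters, and the toy Indonesian data are all hypothetical.

```python
from collections import Counter, defaultdict

def features(example):
    """Turn a (context word, mention) pair into two simple string features."""
    context, mention = example
    return [f"ctx={context.lower()}", f"ent={mention.lower()}"]

def train(labeled):
    """Count how often each feature co-occurs with each label (toy model)."""
    model = defaultdict(Counter)
    for example, label in labeled:
        for feat in features(example):
            model[feat][label] += 1
    return model

def predict(model, example):
    """Return (label, confidence); (None, 0.0) if no feature was ever seen."""
    votes = Counter()
    for feat in features(example):
        votes.update(model.get(feat, {}))
    total = sum(votes.values())
    if total == 0:
        return None, 0.0
    label, hits = votes.most_common(1)[0]
    return label, hits / total

def self_train(seed, unlabeled, threshold=0.9, max_rounds=5):
    """Grow the labeled set with confident, non-redundant predictions."""
    labeled = list(seed)
    seen = {example for example, _ in labeled}
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        accepted, remaining = [], []
        for example in pool:
            if example in seen:
                continue  # redundant: an identical example is already labeled
            label, confidence = predict(model, example)
            if label is not None and confidence >= threshold:
                accepted.append((example, label))
                seen.add(example)
            else:
                remaining.append(example)
        if not accepted:
            break  # nothing new was learned, so further rounds cannot help
        labeled.extend(accepted)
        pool = remaining
    return labeled

# Hypothetical seed and unlabeled data, invented for this sketch.
SEED = [(("presiden", "Joko Widodo"), "PERSON"), (("kota", "Depok"), "LOCATION")]
UNLABELED = [
    ("presiden", "Soekarno"), ("kota", "Bandung"), ("menurut", "Soekarno"),
    ("di", "Bandung"), ("presiden", "Soekarno"), ("oleh", "Habibie"),
]

result = self_train(SEED, UNLABELED)
for example, label in result:
    print(example, label)
```

In this toy run, mentions labeled through a known context word in one round become evidence for labeling the same mention in an unfamiliar context in the next round, which is the mechanism by which a self-training loop can grow its training data across iterations. Examples that never match any known feature stay unlabeled.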
Keywords :
"Encyclopedias","Electronic publishing","Internet","Training data","Classification algorithms","Testing"
Conference_Title :
Computational and Business Intelligence (ISCBI), 2015 3rd International Symposium on
DOI :
10.1109/ISCBI.2015.15