Title :
Ensemble based active annotation for named entity recognition
Author :
Ekbal, Asif ; Saha, Simanto ; Singh, D.
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Patna, Patna, India
Date :
Nov. 30 2012-Dec. 1 2012
Abstract :
Active learning is an important aspect of machine learning for information extraction, as it addresses the high cost of collecting labeled examples. It makes more efficient use of the annotator's time by asking them to label only the instances that are most useful to the learner. We propose a novel method for this problem and show that it improves performance. Our proposed framework is based on an ensemble approach in which a Support Vector Machine and a Conditional Random Field are used as the base learners. The intuition is that the two learning approaches are somewhat orthogonal in their advantages, so combining them can yield superior results. The proposed approach is applied to named entity recognition (NER) in two Indian languages, namely Hindi and Bengali. Results show that the proposed technique indeed improves the performance of the system.
Keywords :
information retrieval; learning (artificial intelligence); natural language processing; support vector machines; Bengali language; Hindi language; Indian language; NER; active learning; conditional random field; ensemble based active annotation; information extraction; machine learning; named entity recognition; support vector machine; Context; Feature extraction; Machine learning; Support vector machines; Training; Training data; Vectors;
Conference_Title :
Emerging Applications of Information Technology (EAIT), 2012 Third International Conference on
Conference_Location :
Kolkata
Print_ISBN :
978-1-4673-1828-0
DOI :
10.1109/EAIT.2012.6407942