DocumentCode :
2503853
Title :
Feature Selection Using Multiobjective Optimization for Named Entity Recognition
Author :
Ekbal, Asif ; Saha, Sriparna ; Garbe, Christoph S.
Author_Institution :
Dept. of Comput. Linguistics, Heidelberg Univ., Heidelberg, Germany
fYear :
2010
fDate :
23-26 Aug. 2010
Firstpage :
1937
Lastpage :
1940
Abstract :
Appropriate feature selection is crucial in any machine learning framework, especially in Maximum Entropy (ME). In this paper, the selection of appropriate features for constructing an ME-based Named Entity Recognition (NER) system is posed as a multiobjective optimization (MOO) problem. Two classification quality measures, recall and precision, are optimized simultaneously using the search capability of a popular evolutionary MOO technique, NSGA-II. The proposed technique is evaluated to determine suitable feature combinations for NER in two languages with significantly different characteristics, Bengali and English. Evaluation yields recall, precision and F-measure values of 70.76%, 81.88% and 75.91%, respectively, for Bengali, and 78.38%, 81.27% and 79.80%, respectively, for English. Comparison with an existing ME-based NER system shows that our proposed feature selection technique is more efficient than heuristic-based feature selection.
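The two-objective formulation in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it shows only the Pareto-dominance test over (recall, precision) pairs on which NSGA-II-style methods rely, plus the standard balanced F-measure. The function names and the candidate scores in the usage note are hypothetical, chosen for illustration.

```python
from typing import List, Tuple

Score = Tuple[float, float]  # (recall, precision) for one candidate feature subset


def f_measure(recall: float, precision: float) -> float:
    """Balanced F-measure: harmonic mean of recall and precision."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Score]) -> List[int]:
    """Indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (recall, precision) scores for four candidate feature subsets.
candidates = [(0.70, 0.82), (0.78, 0.81), (0.69, 0.80), (0.78, 0.79)]
front = pareto_front(candidates)
```

With these made-up scores, the first two candidates are mutually non-dominated (one has higher precision, the other higher recall), while the last two are each dominated. Applying `f_measure` to the reported recall and precision figures gives values consistent with the reported F-measures up to rounding.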
Keywords :
feature extraction; image recognition; learning (artificial intelligence); optimisation; NSGA-II; classification quality measures; evolutionary MOO technique; feature selection technique; heuristic based feature selection; maximum entropy; multiobjective optimization; named entity recognition; Biological cells; Context; Entropy; Machine learning; Optimization; Training; Training data; Feature Selection; Maximum Entropy; Multiobjective Optimization; Named Entity Recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition (ICPR), 2010 20th International Conference on
Conference_Location :
Istanbul
ISSN :
1051-4651
Print_ISBN :
978-1-4244-7542-1
Type :
conf
DOI :
10.1109/ICPR.2010.477
Filename :
5597245