DocumentCode :
2349142
Title :
iSentenizer: An incremental sentence boundary classifier
Author :
Wong, Fai ; Chao, Sam
Author_Institution :
CIS, Univ. of Macau, Macau, China
fYear :
2010
fDate :
21-23 Aug. 2010
Firstpage :
1
Lastpage :
7
Abstract :
In this paper, we revisited the topic of sentence boundary detection and proposed an incremental approach to the problem. The boundary classifier is revised on the fly to adapt to text from a wide variety of sources and genres. We applied i+Learning, an incremental algorithm, to construct the sentence boundary detection model using different features based on local context. Although the model can easily be trained on any genre of text and on any alphabetic language, we emphasize the classifier's ability to adapt to text with domain and topic shifts without retraining the whole model from scratch. Empirical results indicate that the performance of the proposed system is comparable to that of similar systems.
Keywords :
computational linguistics; learning (artificial intelligence); pattern classification; text analysis; alphabet language; i+Learning; iSentenizer; incremental sentence boundary classifier; sentence boundary detection; Artificial neural networks; Tagging; Variable speed drives; incremental learning
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-6896-6
Type :
conf
DOI :
10.1109/NLPKE.2010.5587856
Filename :
5587856