DocumentCode :
3630274
Title :
A genetic algorithm for logical topic text segmentation
Author :
Alin Mihaila;Andreea Mihis;Cristina Mihaila
Author_Institution :
Babeș-Bolyai University, Cluj-Napoca, Romania
fYear :
2008
Firstpage :
500
Lastpage :
505
Abstract :
Topic text segmentation is an important problem in information retrieval and summarization. The segmentation process tries to split a text into thematic clusters (segments) such that every cluster has high cohesion and contiguous clusters are connected as little as possible. The originality of this work is twofold. First, we propose new segmentation criteria based on text entailment for interpreting the cohesion and connectivity of segments. Second, we use a genetic algorithm with a text-entailment-based measure to determine the topic boundaries, in order to identify a predefined number of segments. The obtained results are compared against two manually segmented texts.
Keywords :
"Genetic algorithms","Information retrieval","Context modeling","Frequency","Dynamic programming","Decision trees","Proposals","Trade agreements","Natural languages"
Publisher :
IEEE
Conference_Titel :
Digital Information Management, 2008. ICDIM 2008. Third International Conference on
Print_ISBN :
978-1-4244-2916-5
Type :
conf
DOI :
10.1109/ICDIM.2008.4746783
Filename :
4746783