Title :
Lexical Chains Segmentation in Summarization
Author :
Tatar, Doina ; Mihis, Andreea Diana ; Czibula, Gabriela Serban
Author_Institution :
Dept. of Comput. Sci., Univ. Babes-Bolyai, Cluj-Napoca, Romania
Abstract :
In this paper we propose a new method of linear text segmentation based on the lexical cohesion of a text. The usual steps (computing the lexical chains according to relatedness criteria, scoring the chains by different parameters, selecting the strong chains, and obtaining the segments) are replaced by a single procedure. Namely, a single chain of disambiguated words in a text is established, and the rips in this single chain are taken as the boundaries of the segments of the cohesion structure of the text (CohesionTextTiling, or CTT). Summaries of arbitrary length are obtained by extraction, using three different methods applied to the obtained segments. The informativeness of these summaries is compared with that of the paired summaries of the same length obtained using an earlier method of logical segmentation (coherence segmentation) by text entailment (logical TextTiling, or LTT). Experiments with the CTT and LTT methods are carried out on four "classical" texts from the summarization literature. The conclusion is that the quality of summarization using cohesion segmentation (CTT) is better than the quality using logical (coherence) segmentation (LTT).
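The idea of cutting a text where a single chain of disambiguated words "rips" can be illustrated with a minimal sketch. The abstract does not define a rip precisely, so the criterion used here is an assumption made only for illustration: a rip is placed before a sentence that shares no sense label with the preceding `window` sentences. The function name `segment_by_chain_rips`, the `window` parameter, and the `word#sense` labels are all hypothetical and are not taken from the paper.

```python
from typing import List, Set


def segment_by_chain_rips(sentences: List[Set[str]], window: int = 1) -> List[List[int]]:
    """Group sentence indices into segments, cutting where the chain rips.

    Each sentence is given as a set of sense labels (already disambiguated).
    Assumed criterion (not from the paper): a rip occurs before sentence i
    when it shares no sense label with the previous `window` sentences.
    """
    segments: List[List[int]] = []
    current: List[int] = []
    for i, senses in enumerate(sentences):
        # Union of the sense labels seen in the preceding `window` sentences.
        recent = set().union(*sentences[max(0, i - window):i])
        if current and not (senses & recent):
            # No shared sense: the chain rips, so close the current segment.
            segments.append(current)
            current = []
        current.append(i)
    if current:
        segments.append(current)
    return segments


# Toy input with hypothetical sense labels.
example = [
    {"bank#1", "money#1"},
    {"money#1", "loan#1"},
    {"river#1", "water#1"},
    {"water#1", "fish#1"},
]
segments = segment_by_chain_rips(example)
```

On this toy input the first two sentences are linked by `money#1` and the last two by `water#1`, so the only rip falls between the second and third sentences. A fuller treatment would need a word sense disambiguation step and a relatedness measure, both of which are left out here.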
Keywords :
text analysis; CohesionTextTiling; coherence segmentation; cohesion structure; disambiguated words; lexical chains segmentation; lexical text cohesion; linear text segmentation; logical TextTiling; logical segmentation; relatedness criteria; text entailment; Abstracts; Computer science; Scientific computing; Statistics; Thesauri; Lexical chains; Text segmentation; Text summarization;
Conference_Title :
Symbolic and Numeric Algorithms for Scientific Computing, 2008. SYNASC '08. 10th International Symposium on
Conference_Location :
Timisoara
Print_ISBN :
978-0-7695-3523-4
DOI :
10.1109/SYNASC.2008.11