DocumentCode :
541762
Title :
Clustering based optimal summary generation using Genetic Algorithm
Author :
Kogilavani, A. ; Balasubramanie, P.
Author_Institution :
Dept. of Comput. Sci. & Eng., Kongu Eng. Coll., Perundurai, India
fYear :
2010
fDate :
27-29 Dec. 2010
Firstpage :
324
Lastpage :
329
Abstract :
This paper presents a Genetic Algorithm based sentence extraction strategy and a threshold-based document clustering algorithm to produce a cluster-wise optimal summary. Related documents are grouped into the same cluster using the threshold-based document clustering algorithm. From each cluster, important sentences are selected using a feature profile, which is generated from sentence-specific features such as word weight, sentence position, sentence length, sentence centrality, proper nouns in the sentence, and numerical data in the sentence. A sentence score is calculated for each sentence based on the feature profile. To produce an optimal summary, a fitness function is employed that is based on summary quality criteria: maximizing length, coverage, and informativeness while minimizing redundancy. Machine-generated summaries are compared against human summaries using Precision, Recall, F-measure, and ROUGE-1. The experimental results show that the proposed approach is efficient and outperforms the existing multi-document summarization system based on genetic algorithm (MSBGA).
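The evaluation measures named in the abstract can be illustrated with a minimal sketch. It computes unigram-overlap precision, recall, and F-measure between a machine summary and a human summary, where the recall value corresponds to the usual definition of ROUGE-1. The function name `unigram_overlap_scores`, the lowercasing, and the whitespace tokenization are assumptions made for illustration; the abstract does not describe the paper's actual preprocessing or evaluation setup.

```python
from collections import Counter


def unigram_overlap_scores(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall and F-measure between two summaries.

    Hypothetical helper: tokenization is plain lowercased whitespace
    splitting, which is an assumption and not taken from the paper.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Multiset intersection clips each word's count to the smaller of the two.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0  # ROUGE-1 recall
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return {"precision": precision, "recall": recall, "f_measure": f_measure}


if __name__ == "__main__":
    machine = "the cat sat on the mat"
    human = "the cat lay on the mat"
    print(unigram_overlap_scores(machine, human))
```

Clipping the counts with the multiset intersection prevents a summary from gaining credit by repeating a word more often than the human summary uses it.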
Keywords :
genetic algorithms; pattern clustering; text analysis; document clustering algorithm; genetic algorithm; multidocument summarization system; proper nouns; sentence centrality; sentence extraction strategy; sentence length; sentence position; word weight; Biological cells; Clustering algorithms; Data mining; Feature extraction; Gallium; Humans; Redundancy; Document Clustering; Feature Profile; Linguistic Analysis; Multi-Document Summarization; Optimal Summary; Sentence Extraction; Sentence Score; Sentence Specific Features;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communication and Computational Intelligence (INCOCCI), 2010 International Conference on
Conference_Location :
Erode
Type :
conf
Filename :
5738751