DocumentCode
3776091
Title
Automatic Bengali news documents summarization by introducing sentence frequency and clustering
Author
Md. Majharul Haque;Suraiya Pervin;Zerina Begum
Author_Institution
Department of Computer Science & Engineering, University of Dhaka, Dhaka-1000, Bangladesh
fYear
2015
Firstpage
156
Lastpage
160
Abstract
This paper proposes a method for summarizing Bengali news documents that extracts significant sentences in four major steps: (a) preprocessing, (b) sentence ranking, (c) sentence clustering, and (d) summary generation. A notable feature of the method is the incorporation of sentence frequency, which eliminates redundancy as a consequence. Another notable aspect is sentence clustering based on the similarity ratio among sentences. Summary sentences are selected from all the clusters so that the summary covers as much information as possible, even when that information is scattered across the input document. Two sets of human-generated summaries were used: one to train the system and the other for performance evaluation. The proposed method was found to perform better than the latest state-of-the-art method for Bengali news document summarization. The performance evaluation shows average Precision, Recall and F-measure values of 0.608, 0.664 and 0.632, respectively.
Keywords
"Computer science","Information technology","Electronic mail","Redundancy","Performance evaluation","Art","Internet"
Publisher
IEEE
Conference_Titel
Computer and Information Technology (ICCIT), 2015 18th International Conference on
Type
conf
DOI
10.1109/ICCITechn.2015.7488060
Filename
7488060