DocumentCode
2258892
Title
Ikhtasir — A user selected compression ratio Arabic text summarization system
Author
Azmi, Aqil ; Al-Thanyyan, Suha
Author_Institution
Dept. of Comput. Sci., King Saud Univ., Riyadh, Saudi Arabia
fYear
2009
fDate
24-27 Sept. 2009
Firstpage
1
Lastpage
7
Abstract
Automatic text summarization is an active research field. The rapid growth of the Web, and the associated information overloading, has injected new life into this research area. Automatic text summarization has been studied extensively for certain languages; Arabic is not one of them. In this paper we present an automatic extractive Arabic text summarization system in which the user can cap the size of the summary. The system does not require learning and employs rhetorical structure theory (RST) along with a scheme that scores individual sentences. For output, sentences are selected with the objective of maximizing the overall score of the summary while keeping its size within the user-selected compression ratio. For evaluation, system-generated summaries of various lengths were compared against those produced by a human professional. Experiments on sample texts show that our system outperforms some existing systems, including ones that require learning.
Keywords
natural language processing; text analysis; Arabic language; Arabic text summarization system; Ikhtasir; automatic text summarization; information overloading; sentence scoring; user selected compression ratio; Bayesian methods; Computer science; Data mining; Genetic programming; Humans; Information systems; Natural languages; Performance evaluation; Speech; Text processing; Algorithms; Arabic text summarization
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location
Dalian
Print_ISBN
978-1-4244-4538-7
Electronic_ISBN
978-1-4244-4540-0
Type
conf
DOI
10.1109/NLPKE.2009.5313732
Filename
5313732