  • DocumentCode
    2258892
  • Title
    Ikhtasir — A user selected compression ratio Arabic text summarization system
  • Author
    Azmi, Aqil; Al-Thanyyan, Suha
  • Author_Institution
    Dept of Comput. Sci., King Saud Univ., Riyadh, Saudi Arabia
  • fYear
    2009
  • fDate
    24-27 Sept. 2009
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Automatic text summarization is an active research field. The rapid growth of the Web, and the associated information overload, has injected new life into this research area. Some languages have seen plenty of research on automatic text summarization; Arabic is not one of them. In this paper we present an automatic extractive Arabic text summarization system in which the user can cap the size of the summary. The system does not require learning and employs rhetorical structure theory (RST) along with a sentence scoring scheme that scores each sentence individually. For output, sentences are selected with the objective of maximizing the overall score of a summary whose size is within the user-selected compression ratio. For evaluation, system-generated summaries of various lengths were compared against summaries written by a professional human summarizer. Experiments on sample texts show that our system outperforms some of the other existing systems, including those that require learning.
  • Keywords
    natural language processing; text analysis; Arabic language; Arabic text summarization system; Ikhtasir; automatic text summarization; information overloading; sentence scoring; user selected compression ratio; Bayesian methods; Computer science; Data mining; Genetic programming; Humans; Information systems; Natural languages; Performance evaluation; Speech; Text processing; Algorithms; Arabic text summarization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    International Conference on Natural Language Processing and Knowledge Engineering, 2009 (NLP-KE 2009)
  • Conference_Location
    Dalian
  • Print_ISBN
    978-1-4244-4538-7
  • Electronic_ISBN
    978-1-4244-4540-0
  • Type
    conf
  • DOI
    10.1109/NLPKE.2009.5313732
  • Filename
    5313732