Title of article :
Minimum redundancy and maximum relevance for single and multi-document Arabic text summarization
Author/Authors :
Oufaida, Houda Ecole Nationale Supérieure d’Informatique (ESI), Algeria , Nouali, Omar Research Center on Scientific and Technical Information (CERIST), Algeria , Blache, Philippe Aix Marseille Université - Centre national de la recherche scientifique (CNRS), France
From page :
450
To page :
461
Abstract :
Automatic text summarization aims to produce summaries for one or more texts using machine techniques. In this paper, we propose a novel statistical summarization system for Arabic texts. Our system uses a clustering algorithm and an adapted discriminant analysis method, mRMR (minimum redundancy and maximum relevance), to score terms. Through mRMR analysis, terms are ranked according to their discriminant and coverage power. We also propose a novel sentence extraction algorithm which selects sentences with top-ranked terms and maximum diversity. Our system uses minimal language-dependent processing: sentence splitting, tokenization and root extraction. Experimental results on the EASC and TAC 2011 MultiLingual datasets showed that our proposed approach is competitive with state-of-the-art systems.
Keywords :
Arabic text summarization , Sentence extraction , mRMR , Minimum redundancy , Maximum relevance
Journal title :
Journal of King Saud University - Computer and Information Sciences
Record number :
2609805