  • DocumentCode
    3756945
  • Title
    Summary Sentence Classification Using Stylometry
  • Author
    Rushdi Shams;Robert E. Mercer
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Western Ontario, London, ON, Canada
  • fYear
    2015
  • Firstpage
    1220
  • Lastpage
    1227
  • Abstract
    Summary sentence classification is an important step to generate document surrogates known as summary extracts. The quality of an extract depends much on the correctness of this step. We aim to classify potential summary sentences using a statistical learning method that models sentences according to a linguistic technique which examines writing styles, known as Stylometry. The sentences in documents are represented using a novel set of stylometric attributes. For learning, an innovative two-stage classification is set up that comprises two learners in subsequent steps: k-Nearest Neighbour and Naive Bayes. We train and test the learners with the newswire documents collected from two benchmark datasets, viz., the CAST and the DUC2002 datasets. Extensive experimentation strongly suggests that our method has outstanding performance for the single document summarization task. However, its performance is mixed for classifying summary sentences from multiple documents. Finally, comparisons show that our method performs significantly better than most of the popular extractive summarization methods.
  • Keywords
    "Indexes","Complexity theory","Pragmatics","Writing","Semantics","Benchmark testing","Data mining"
  • Publisher
    ieee
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2015 IEEE 14th International Conference on
  • Type
    conf
  • DOI
    10.1109/ICMLA.2015.181
  • Filename
    7424488