• DocumentCode
    3319277
  • Title
    Automatic text summarization based on sentences clustering and extraction
  • Author
    Zhang, Pei-Ying ; Li, Cun-He
  • Author_Institution
    Coll. of Comput. & Commun. Eng., China Univ. of Pet., Dongying, China
  • fYear
    2009
  • fDate
    8-11 Aug. 2009
  • Firstpage
    167
  • Lastpage
    170
  • Abstract
    Automatic text summarization plays an important role in information retrieval and text classification, and may provide a solution to the information overload problem. Text summarization is the process of reducing the size of a text while preserving its information content. This paper proposes a summarization approach based on sentence clustering. The approach consists of three steps: first, it clusters the sentences based on the semantic distance among sentences in the document; then, for each cluster, it calculates the accumulative sentence similarity using a multi-feature combination method; finally, it chooses the topic sentences according to a set of extraction rules. The purpose of this paper is to show that the summarization result depends not only on the sentence features but also on the sentence similarity measure. Experimental results on the DUC 2003 dataset show that the proposed approach can improve performance compared with other summarization methods.
  • Keywords
    classification; information retrieval; pattern clustering; text analysis; automatic text summarization; document sentence; information overload problem; information retrieval; multifeature combination method; sentence clustering; sentence extraction; text classification; Clustering algorithms; Data mining; Educational institutions; Information retrieval; Natural language processing; Petroleum; Text categorization; Volume measurement; Web sites; sentence extractive technique; sentences clustering; similarity measure; text summarization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Technology, 2009. ICCSIT 2009. 2nd IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-4519-6
  • Electronic_ISBN
    978-1-4244-4520-2
  • Type
    conf
  • DOI
    10.1109/ICCSIT.2009.5234971
  • Filename
    5234971
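
The three steps named in the abstract (cluster the sentences, accumulate sentence similarity within each cluster, extract one topic sentence per cluster) can be sketched roughly as follows. This is a minimal illustration and not the authors' method: the record does not specify the semantic distance, the multi-feature combination, or the extraction rules, so the sketch assumes plain bag-of-words cosine similarity, a greedy single-pass clustering with a hypothetical `threshold` parameter, and a "highest accumulated similarity wins" rule. All function names are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercased word tokens; a stand-in for real linguistic preprocessing.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    # ASSUMPTION: used here in place of the paper's semantic similarity.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_sentences(vectors, threshold=0.3):
    # Step 1 (assumed variant): greedy single-pass clustering. A sentence
    # joins the first cluster whose seed sentence is similar enough,
    # otherwise it starts a new cluster.
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def summarize(sentences, threshold=0.3):
    vectors = [Counter(tokenize(s)) for s in sentences]
    clusters = cluster_sentences(vectors, threshold)
    chosen = []
    for cluster in clusters:
        # Step 2: accumulated similarity of a sentence to the rest of its cluster.
        def accumulated(i):
            return sum(cosine(vectors[i], vectors[j]) for j in cluster if j != i)

        # Step 3 (assumed rule): keep the sentence with the highest score;
        # ties go to the earliest sentence in the cluster.
        chosen.append(max(cluster, key=accumulated))
    # Return the selected sentences in original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(list_of_sentences)` returns one representative sentence per cluster. The choice of similarity function is isolated in `cosine`, so a different measure can be swapped in to see how the selected sentences change, which is the dependence the abstract highlights.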