• DocumentCode
    2894100
  • Title

    A Novel Chinese Multi-Document Summarization Using Clustering Based Sentence Extraction

  • Author

    Liu, De-xi ; He, Yan-Xiang ; Ji, Dong-hong ; Yang, Hua

  • Author_Institution
    Sch. of Phys., Xiangfan Univ.
  • fYear
    2006
  • fDate
    13-16 Aug. 2006
  • Firstpage
    2592
  • Lastpage
    2597
  • Abstract
    This paper proposes a strategy for Chinese multi-document summarization based on clustering and sentence extraction. It adopts a term vector to represent the linguistic units in Chinese documents, which achieves higher representation quality than the traditional word-based vector space model to a certain extent. For clustering, we propose two heuristics to automatically detect the proper number of clusters: the first makes full use of the summary length fixed by the user; the second is a stability method, which has been applied to other unsupervised learning problems. We also discuss a global searching method for selecting sentences from the clusters. To evaluate our summarization strategy, we adopt an extrinsic evaluation method based on a classification task. Experimental results on a news document set show that the new strategy can significantly enhance the performance of Chinese multi-document summarization.
  • Keywords
    natural languages; pattern classification; pattern clustering; text analysis; unsupervised learning; Chinese multidocument summarization; extrinsic evaluation method; global searching method; heuristics; linguistic unit; pattern classification; pattern clustering; sentence extraction; stability method; term vector space; unsupervised learning; Computer science; Cybernetics; Data mining; Helium; Information retrieval; Instruments; Machine learning; Mathematical model; Natural languages; Ontologies; Physics computing; Stability; Unsupervised learning; Chinese Multi-document summarization; extrinsic evaluation; global searching method; stability method; term vector space;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2006 International Conference on
  • Conference_Location
    Dalian, China
  • Print_ISBN
    1-4244-0061-9
  • Type

    conf

  • DOI
    10.1109/ICMLC.2006.258855
  • Filename
    4028501