• DocumentCode
    2483026
  • Title
    A Statistical Approach for Automatic Text Summarization by Extraction
  • Author
    Chandra, Munesh ; Gupta, Vikrant ; Paul, Santosh Kr
  • Author_Institution
    Sch. of Eng., DIT, Noida, India
  • fYear
    2011
  • fDate
    3-5 June 2011
  • Firstpage
    268
  • Lastpage
    271
  • Abstract
    Automatic document summarization is a highly interdisciplinary research area that draws on computer science as well as cognitive psychology. Summarization compresses an original document into a shorter version by extracting almost all of its essential concepts with text mining techniques. This research proposes a statistical approach to automatic text summarization, K-mixture semantic relationship significance (KSRS), to enhance the quality of document summaries. KSRS employs the K-mixture probabilistic model to establish term weights in a statistical sense, and then identifies term relationships to derive the semantic relationship significance (SRS) of nouns, which manifests sentence semantics. Sentences are ranked by their SRS values, and those whose nouns have significant semantic relationships are extracted to form the summary.
  • Keywords
    data mining; statistical analysis; text analysis; automatic document summarization; cognitive psychology; computer science; k-mixture probabilistic model; k-mixture semantic relationship significance; semantic relationship significance; statistical automatic text summarization; text mining techniques; Information retrieval; Pragmatics; Probabilistic logic; Semantics; Text categorization; extraction; statistical approach
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communication Systems and Network Technologies (CSNT), 2011 International Conference on
  • Conference_Location
    Katra, Jammu
  • Print_ISBN
    978-1-4577-0543-4
  • Electronic_ISBN
    978-0-7695-4437-3
  • Type
    conf
  • DOI
    10.1109/CSNT.2011.65
  • Filename
    5966451