• Title of article

    Frequent Term Based Clustering of Stories with Semantic Analysis for Searching and Retrieval

  • Author/Authors

    Amrut Nagasunder (author), Bharath Boregowda (author), Madhu Venkatesha (author), Ananthanarayana V. S. (author)

  • Issue Information
    Periodical with serial number, year 2010
  • Pages
    9
  • From page
    219
  • To page
    227
  • Abstract
    Effective document organizations are often those which provide a concise representation of text content in a large collection of documents. We have considered the task of clustering of stories (documents) as a facilitation of effectual document arrangement for searching and retrieval. We propose a novel representation for a story, based on the essential parts of speech - the nouns, verbs and adjectives. We then perform a clustering of these story representations, resulting in a graph structure where the story representations are conjoined at nodes having the same or synonymous noun. Such a structure can be queried for stories by giving a search string. We employ the use of a knowledge bank throughout the system as a step to realize semantic analysis of the text. For testing the goodness of cluster, we carry out the classification test, on two data-sets. We are able to achieve significantly high quality of clustering, with promising results in regard to memory compaction.
  • Keywords
    Document clustering, semantic analysis, natural language processing, text mining
  • Journal title
    International Journal of Advanced Research in Computer Science
  • Serial Year
    2010
  • Record number

    668399
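
The graph structure described in the abstract, where stories are joined at nodes holding the same or a synonymous noun, can be illustrated with a minimal sketch. This is not the authors' implementation: the `SYNONYMS` table is a hypothetical stand-in for the knowledge bank, the nouns are assumed to have been extracted beforehand by a part-of-speech tagger, and the verbs, adjectives and frequent-term selection mentioned in the abstract are left out.

```python
from collections import defaultdict

# Hypothetical stand-in for the knowledge bank: maps a noun to one
# canonical form so that synonymous nouns share a single node.
SYNONYMS = {"automobile": "car", "auto": "car", "physician": "doctor"}


def canonical(noun):
    """Lower-case a noun and fold it onto its canonical synonym, if any."""
    noun = noun.lower()
    return SYNONYMS.get(noun, noun)


def build_index(stories):
    """Join stories at noun nodes.

    stories: dict mapping a story id to the nouns found in that story
    (assumed to be extracted already; no tagging is done here).
    Returns a dict mapping each canonical noun to the set of story ids
    that mention it or one of its synonyms.
    """
    index = defaultdict(set)
    for story_id, nouns in stories.items():
        for noun in nouns:
            index[canonical(noun)].add(story_id)
    return index


def search(index, query):
    """Return the sorted ids of stories attached to any noun in the query."""
    hits = set()
    for word in query.split():
        hits |= index.get(canonical(word), set())
    return sorted(hits)


# Toy data, invented for illustration only.
stories = {
    "s1": ["car", "road"],
    "s2": ["automobile", "doctor"],
    "s3": ["physician", "hospital"],
}
index = build_index(stories)
```

With this toy data, `search(index, "auto")` returns `["s1", "s2"]`, because "car" and "automobile" fold onto the same node. Keeping one node per synonym group is also what would give the memory compaction the abstract mentions, since shared nouns are stored once rather than per story.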