Title of article

Frequent Term Based Clustering of Stories with Semantic Analysis for Searching and Retrieval

Author/Authors

Amrut Nagasunder (author), Bharath Boregowda (author), Madhu Venkatesha (author), Ananthanarayana V. S. (author)

Issue Information

Journal with serial number, year 2010

Pages

9

From page

219

To page

227

Abstract

Effective document organizations are often those that provide a concise representation of the text content of a large collection of documents. We consider the task of clustering stories (documents) as a way to arrange documents effectively for searching and retrieval. We propose a novel representation for a story, based on the essential parts of speech: nouns, verbs and adjectives. We then cluster these story representations, which yields a graph structure in which the story representations are joined at nodes that have the same or a synonymous noun. Such a structure can be queried for stories by giving a search string. We use a knowledge bank throughout the system as a step toward semantic analysis of the text. To test the quality of the clusters, we carry out a classification test on two data sets. We achieve high clustering quality, with promising results with regard to memory compaction.
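
The graph structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the nouns of each story have already been extracted, and it uses a small hand-written dictionary (the hypothetical `SYNONYMS`) as a stand-in for the knowledge bank. The function names `canonical`, `build_graph` and `search` are likewise illustrative only.

```python
from collections import defaultdict

# Hypothetical stand-in for the knowledge bank: maps a noun to a canonical synonym.
SYNONYMS = {"automobile": "car", "vehicle": "car", "kid": "child"}


def canonical(noun):
    """Lowercase a noun and merge it with its canonical synonym, if any."""
    noun = noun.lower()
    return SYNONYMS.get(noun, noun)


def build_graph(stories):
    """Link each canonical noun node to the ids of the stories that contain it."""
    graph = defaultdict(set)
    for story_id, nouns in stories.items():
        for noun in nouns:
            graph[canonical(noun)].add(story_id)
    return graph


def search(graph, query):
    """Return ids of stories that share at least one noun node with the query."""
    hits = set()
    for word in query.split():
        hits |= graph.get(canonical(word), set())
    return sorted(hits)


# Assumed input: story id -> nouns already extracted from the story text.
stories = {
    "s1": ["car", "road"],
    "s2": ["automobile", "child"],
    "s3": ["river", "kid"],
}
graph = build_graph(stories)
print(search(graph, "vehicle"))  # ['s1', 's2']
```

Stories s1 and s2 meet at the same node because "car" and "automobile" resolve to one canonical noun, so a query for the synonym "vehicle" retrieves both. A fuller version would also carry the verbs and adjectives named in the abstract.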

Keywords

Document clustering , semantic analysis , Natural language processing , text mining

Journal title

International Journal of Advanced Research in Computer Science

Serial Year

2010

Record number

668399