Title of article
Frequent Term Based Clustering of Stories with Semantic Analysis for Searching and Retrieval
Author/Authors
Amrut Nagasunder, Bharath Boregowda, Madhu Venkatesha, Ananthanarayana V. S.
Issue Information
Journal with serial issue number, year 2010
Pages
9
From page
219
To page
227
Abstract
Effective document organizations are often those that provide a concise representation of text content in a large collection of documents. We consider the task of clustering stories (documents) as a way to arrange documents effectively for searching and retrieval. We propose a novel representation for a story, based on the essential parts of speech: the nouns, verbs and adjectives. We then cluster these story representations, which yields a graph structure in which the story representations are joined at nodes that have the same or a synonymous noun. Such a structure can be queried for stories by giving a search string. We use a knowledge bank throughout the system to realize semantic analysis of the text. To test the quality of the clusters, we carry out a classification test on two data sets. We achieve high clustering quality, with promising results with regard to memory compaction.
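The graph structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `SYNONYMS` table is a hypothetical stand-in for the knowledge bank, the stories are assumed to arrive already tagged with coarse part-of-speech labels, and all function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for the knowledge bank: noun -> canonical synonym.
SYNONYMS = {"automobile": "car", "auto": "car", "canine": "dog"}


def canonical(noun):
    """Lower-case a noun and resolve it to its canonical synonym, if any."""
    noun = noun.lower()
    return SYNONYMS.get(noun, noun)


def represent(tagged_story):
    """Keep only nouns, verbs and adjectives from (word, tag) pairs."""
    keep = {"NOUN", "VERB", "ADJ"}
    return [
        (canonical(word) if tag == "NOUN" else word.lower(), tag)
        for word, tag in tagged_story
        if tag in keep
    ]


def build_graph(stories):
    """Map each canonical noun node to the ids of the stories joined at it."""
    graph = defaultdict(set)
    for story_id, tagged in stories.items():
        for word, tag in represent(tagged):
            if tag == "NOUN":
                graph[word].add(story_id)
    return graph


def search(graph, query):
    """Return ids of stories joined at any noun node matching a query word."""
    hits = set()
    for word in query.split():
        hits |= graph.get(canonical(word), set())
    return hits


# Toy, hand-tagged stories (assumed input format, not from the paper).
stories = {
    "s1": [("The", "DET"), ("dog", "NOUN"), ("chased", "VERB"),
           ("red", "ADJ"), ("car", "NOUN")],
    "s2": [("An", "DET"), ("automobile", "NOUN"), ("stopped", "VERB")],
    "s3": [("A", "DET"), ("canine", "NOUN"), ("slept", "VERB")],
}
graph = build_graph(stories)
print(sorted(search(graph, "auto")))
```

In this sketch, stories `s1` and `s2` share the node `car` because `automobile` resolves to the same canonical noun, so a query for `auto` reaches both. A real system would presumably replace the toy synonym table with a lexical resource and tag the text automatically.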
Keywords
Document clustering, semantic analysis, natural language processing, text mining
Journal title
International Journal of Advanced Research in Computer Science
Serial Year
2010
Record number
668399