DocumentCode :
3253810
Title :
Semantic document retrieval system using fuzzy clustering and reformulated query
Author :
Murali, Dabbu ; Damodaram, Avula
Author_Institution :
CSE, CMR CET, Hyderabad, India
fYear :
2015
fDate :
19-20 March 2015
Firstpage :
746
Lastpage :
753
Abstract :
In this paper, we develop an algorithm for a document retrieval system based on clustering and query reformulation. First, pre-processing is applied to all documents to remove unnecessary words and phrases. The documents are then partitioned by the possibilistic fuzzy c-means (PFCM) clustering algorithm using the proposed semantic similarity measure. For each cluster, an index is constructed that contains the important keywords common to the documents in that cluster. When the user enters keywords as input, the system processes them with the WORDNET ontology to obtain neighbourhood keywords and related synset keywords. The set of keywords obtained from WORDNET is refined, and the refined keywords are matched against the index keywords of the clusters to calculate a matching score. Finally, the documents in the clusters with the maximum matching scores are returned first as the documents related to the query keyword. Experiments are carried out on different sets of documents, performance is evaluated by precision, and the results show that the proposed approach performs better in terms of precision.
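The query-side steps described in the abstract (keyword expansion, matching against per-cluster indexes, ranking clusters by score) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `RELATED` dictionary is a hypothetical stand-in for WORDNET lookups, and the scoring rule (fraction of a cluster's index keywords hit by the expanded query) is an assumed formula, since the abstract does not state how the matching score is computed.

```python
# Hypothetical stand-in for WORDNET: keyword -> related (synset/neighbourhood) terms.
RELATED = {
    "car": {"automobile", "vehicle"},
    "engine": {"motor"},
}

def expand_query(keywords, related=RELATED):
    """Return the query keywords together with their related terms."""
    expanded = set()
    for kw in keywords:
        kw = kw.lower()
        expanded.add(kw)
        expanded |= related.get(kw, set())
    return expanded

def matching_score(expanded, index_keywords):
    """Assumed score: fraction of the cluster's index keywords present in the query."""
    index = set(index_keywords)
    if not index:
        return 0.0
    return len(expanded & index) / len(index)

def rank_clusters(keywords, cluster_indexes):
    """Score every cluster index and return (cluster_id, score), best first."""
    expanded = expand_query(keywords)
    scores = {cid: matching_score(expanded, idx)
              for cid, idx in cluster_indexes.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy cluster indexes (illustrative data, not from the paper).
cluster_indexes = {
    "c1": ["automobile", "vehicle", "road"],
    "c2": ["recipe", "motor", "oven", "kitchen"],
}
ranking = rank_clusters(["car", "engine"], cluster_indexes)
```

In this toy setup, documents from the first cluster in `ranking` would be returned ahead of the others; the PFCM clustering and index construction steps are outside the scope of the sketch.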
Keywords :
document handling; fuzzy set theory; ontologies (artificial intelligence); pattern clustering; query processing; PFCM clustering algorithm; WORDNET ontology; clustering process; documents partition; fuzzy clustering; matching score; neighbourhood keywords; performance analysis; possibilistic fuzzy c means; query basis; query keyword; reformulated query; semantic document retrieval system; Clustering algorithms; Computers; Indexing; Information retrieval; Ontologies; Semantics; Document clustering; Ontology; Semantic similarity measure; WORDNET; possibilistic fuzzy c means;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Engineering and Applications (ICACEA), 2015 International Conference on Advances in
Conference_Location :
Ghaziabad
Type :
conf
DOI :
10.1109/ICACEA.2015.7164788
Filename :
7164788