Title :
Fuzzy discrete correlation for document clustering
Author :
Danesh, Malihe ; Naghibzadeh, Mahmoud ; Harati, Ahad
Author_Institution :
Comput. Eng. Dept., Ferdowsi Univ. of Mashhad, Mashhad, Iran
Abstract :
The quantity of text documents on the Internet, in digital libraries, and in news sources is growing enormously. This has increased interest in methods that help users navigate, summarize, and organize this information effectively. A recent method based on neighbor and link concepts performs better than earlier methods in this field. Two documents are neighbors if their similarity exceeds a defined threshold; the corresponding neighbor matrix element is set to one if they are neighbors and to zero otherwise. This binary encoding discards some information about document similarity and therefore reduces accuracy. To overcome this problem, we propose two methods, “discrete correlation” and “fuzzy correlation”, both of which refine the neighbor definition in order to reach better clustering results. To evaluate our work, we used the k-means algorithm to determine the initial cluster centers and the similarity criteria between documents and centers. Results of applying the proposed methods to real-world document data sets, measured by information retrieval factors, show better performance than traditional algorithms and previous work.
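The thresholded neighbor matrix described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the paper: cosine similarity as the similarity measure, the threshold value 0.5, the toy term-frequency vectors, the function names, and the definition of a link as the number of common neighbors (the usual formulation in neighbor/link clustering). The sketch does not implement the proposed discrete or fuzzy correlation, which the abstract does not define; it only shows the binary baseline and the information it loses.

```python
import math

def cosine(u, v):
    """Cosine similarity of two term-frequency vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_matrix(docs):
    """Pairwise similarity between all documents."""
    return [[cosine(u, v) for v in docs] for u in docs]

def neighbor_matrix(sim, theta):
    """1 if similarity is more than the threshold theta, else 0."""
    return [[1 if s > theta else 0 for s in row] for row in sim]

def link_matrix(nbr):
    """Assumed link definition: number of common neighbors of i and j."""
    n = len(nbr)
    return [[sum(nbr[i][k] * nbr[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical toy corpus: four documents over a four-term vocabulary.
docs = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 2],
]

sim = similarity_matrix(docs)
nbr = neighbor_matrix(sim, theta=0.5)   # theta is an arbitrary choice
link = link_matrix(nbr)

for row in nbr:
    print(row)
```

In this toy corpus, documents 0 and 1 have similarity 0.8 and documents 0 and 2 have similarity of about 0.77, yet both pairs receive the same neighbor value of one. That flattening is the loss of similarity information the abstract refers to.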
Keywords :
Internet; digital libraries; fuzzy reasoning; information retrieval; pattern clustering; text analysis; discrete correlation; document clustering; fuzzy correlation; fuzzy discrete correlation; information retrieval factors; k-means algorithm; neighbor matrix element; news sources; text document; Accuracy; Clustering algorithms; Correlation; Fuzzy sets; Fuzzy systems; Polynomials; Software; discrete; fuzzy; link; neighbor; similarity
Conference_Title :
Artificial Intelligence and Signal Processing (AISP), 2011 International Symposium on
Conference_Location :
Tehran
Print_ISBN :
978-1-4244-9833-8
DOI :
10.1109/AISP.2011.5960974