DocumentCode :
685819
Title :
Comparative analysis of similarity measures in document clustering
Author :
Kavitha Karun, A. ; Mintu, Philip ; Lubna, K.
Author_Institution :
Dept. of Comput. Sci. & Eng., Rajagiri Sch. of Eng. & Technol., Kochi, India
fYear :
2013
fDate :
12-14 Dec. 2013
Firstpage :
857
Lastpage :
860
Abstract :
Rapid advances in science and technology have led to the accumulation of large volumes of data. Extracting useful and meaningful information from this volume of data is a tedious process, which has driven the development of efficient data mining methods for discovering interesting, previously unknown knowledge in large data sets. Document mining, or text mining, refers to data mining techniques that extract interesting and nontrivial information and knowledge from unstructured text. Document clustering is an effective text mining method that groups similar documents together. Similarity measures play a key role in clustering documents. This paper presents a comparative study of the effect of various similarity measures on clustering documents from the same data set.
Keywords :
data mining; information retrieval; pattern classification; pattern clustering; text analysis; comparative analysis; data mining methods; document clustering; document mining; information extraction; similarity measures; text mining; Clustering algorithms; Computational modeling; Correlation coefficient; Euclidean distance; Vectors; Clusters
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Green Computing, Communication and Conservation of Energy (ICGCE), 2013 International Conference on
Conference_Location :
Chennai
Type :
conf
DOI :
10.1109/ICGCE.2013.6823554
Filename :
6823554