DocumentCode :
3026049
Title :
Text document clustering on the basis of inter passage approach by using K-means
Author :
Mishra, Rupesh Kumar ; Saini, Kanika ; Bagri, Sakshi
Author_Institution :
Comput. Sci. & Eng., Manav Rachna Coll. of Eng., Faridabad, India
fYear :
2015
fDate :
15-16 May 2015
Firstpage :
110
Lastpage :
113
Abstract :
Document clustering usually treats each document as revolving around a single topic. To achieve more efficient clustering results, it is important to account for the fact that a document may deal with more than one topic. This work proposes a new inter-passage based clustering technique that clusters the segments of documents on the basis of their similarity. The input is a collection of documents consisting of multi-topic segments taken from the web. SentiWordNet is used to calculate a segment score for each segment within a document. Based on these segment scores, segment-based clustering is performed at the intra-document level. After intra-document segment-based clustering, the k-means approach is applied to the entire collection of documents to perform inter-document clustering, in which similar segments from different documents are grouped under a single cluster. The proposed technique is intended to help organize multi-topic documents efficiently into their corresponding clusters.
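The pipeline described in the abstract (score each segment, then group similar segments across documents with k-means) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `TOY_LEXICON` is a hypothetical stand-in for SentiWordNet scores, the segment score is assumed to be a simple mean of word scores, the intra-document clustering step is omitted, and `segment_score`, `kmeans_1d`, and `cluster_segments` are names invented for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon standing in for SentiWordNet word scores (assumption).
TOY_LEXICON: Dict[str, float] = {"good": 0.8, "great": 0.9, "bad": -0.7, "poor": -0.6}


def segment_score(segment: str) -> float:
    """Mean lexicon score over the words of a segment; unknown words count as 0.0."""
    words = segment.lower().split()
    if not words:
        return 0.0
    return sum(TOY_LEXICON.get(w, 0.0) for w in words) / len(words)


def kmeans_1d(values: List[float], k: int, iters: int = 50) -> List[int]:
    """Plain 1-D k-means with deterministic, evenly spaced initial centroids."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


def cluster_segments(docs: Dict[str, List[str]], k: int) -> List[Tuple[str, str, int]]:
    """Pool segments from all documents and cluster them by segment score."""
    pooled = [(doc_id, seg) for doc_id, segs in docs.items() for seg in segs]
    labels = kmeans_1d([segment_score(seg) for _, seg in pooled], k)
    return [(doc_id, seg, lab) for (doc_id, seg), lab in zip(pooled, labels)]


# Two made-up documents, each already split into two segments.
docs = {
    "d1": ["good great service", "bad poor food"],
    "d2": ["great good view", "poor bad room"],
}
result = cluster_segments(docs, k=2)
for doc_id, seg, lab in result:
    print(doc_id, lab, seg)
```

In this toy run, segments with similar scores from different documents receive the same cluster label, which mirrors the inter-document step the abstract describes. A real system would presumably look up scores in SentiWordNet itself and could cluster on richer features than a single scalar.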
Keywords :
learning (artificial intelligence); pattern clustering; text analysis; K-means clustering; SentiWordNet; document similarity; inter passage approach; inter-document clustering; intra-document segment based clustering; segment score segment based clustering; text document clustering; Automation; Cleaning; Clustering algorithms; Computer science; Information retrieval; Organizations; Partitioning algorithms; Document Clustering; Inter Document Clustering; Intra Document Clustering; SentiWordNet; Sentiment analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computing, Communication & Automation (ICCCA), 2015 International Conference on
Conference_Location :
Noida
Print_ISBN :
978-1-4799-8889-1
Type :
conf
DOI :
10.1109/CCAA.2015.7148354
Filename :
7148354