DocumentCode :
671513
Title :
Coupled term-term relation analysis for document clustering
Author :
Xin Cheng ; Duoqian Miao ; Can Wang ; Longbing Cao
Author_Institution :
Dept. of Comput. Sci. & Technol., Tongji Univ., Shanghai, China
fYear :
2013
fDate :
4-9 Aug. 2013
Firstpage :
1
Lastpage :
8
Abstract :
Traditional document clustering approaches are usually based on the Bag of Words model, which is limited by its assumption that terms are independent. Recent strategies capture the relation between terms through statistical analysis, but they estimate that relation purely from the terms' co-occurrence across documents. As a result, implicit interactions through other link terms are overlooked, and the information discovered is incomplete. This paper proposes a coupled term-term relation model for document representation that considers both the intra-relation (i.e., co-occurrence of terms) and the inter-relation (i.e., dependency of terms via link terms) between a pair of terms. The coupled relation for each pair of terms is then used to map a document onto a new feature space that carries more semantic information. Extensive experiments show that document clustering using the proposed relation achieves a significant performance improvement over state-of-the-art techniques.
Keywords :
data mining; document handling; pattern clustering; statistical analysis; coupled term-term relation analysis; document clustering; document representation; feature space; semantic information; Computer science; Context; Data mining; Frequency measurement; Semantics; Sparse matrices; Vectors
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks (IJCNN), The 2013 International Joint Conference on
Conference_Location :
Dallas, TX
ISSN :
2161-4393
Print_ISBN :
978-1-4673-6128-6
Type :
conf
DOI :
10.1109/IJCNN.2013.6706853
Filename :
6706853