DocumentCode :
2579829
Title :
A Hierarchical Text Clustering Algorithm with Cognitive Situation Dimensions
Author :
Guo, Yi ; Shao, Zhiqing ; Hua, Nan
Author_Institution :
Dept. of Comput. Sci. & Eng., East China Univ. of Sci. & Technol., Shanghai
fYear :
2009
fDate :
23-25 Jan. 2009
Firstpage :
11
Lastpage :
14
Abstract :
Text clustering is an important task in text mining. Its purpose is to group similar text documents together efficiently to meet human interests in information searching and understanding. The clustering procedure should involve a cognitive process of text understanding or comprehension. This paper introduces CogHTC, a hierarchical text clustering algorithm inspired by cognitive situation models. CogHTC extracts representative features from four carefully selected cognitive situation dimensions, with consideration of clustering efficiency. The experimental results demonstrated good performance of CogHTC and revealed that its clustering results are class or domain sensitive: CogHTC performed better on cross-class clustering than on inner-class clustering.
Keywords :
cognitive systems; data mining; pattern clustering; query formulation; text analysis; cognitive situation dimensions; cross-class clustering; hierarchical text clustering; information searching; information understanding; inner-class clustering; text documents grouping; text mining; Clustering algorithms; Computer science; Data engineering; Data mining; Feature extraction; Frequency; Humans; Knowledge engineering; Testing; Text mining; Cognitive; Hierarchical; Situation Dimensions; Text Clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Knowledge Discovery and Data Mining, 2009. WKDD 2009. Second International Workshop on
Conference_Location :
Moscow
Print_ISBN :
978-0-7695-3543-2
Type :
conf
DOI :
10.1109/WKDD.2009.17
Filename :
4771866