DocumentCode
2579829
Title
A Hierarchical Text Clustering Algorithm with Cognitive Situation Dimensions
Author
Guo, Yi ; Shao, Zhiqing ; Hua, Nan
Author_Institution
Dept. of Comput. Sci. & Eng., East China Univ. of Sci. & Technol., Shanghai
fYear
2009
fDate
23-25 Jan. 2009
Firstpage
11
Lastpage
14
Abstract
Text clustering is an important task in text mining. Its purpose is to group similar text documents together efficiently to meet human interests in information searching and understanding. The clustering procedure should involve a cognitive process of text understanding or comprehension. This paper introduces CogHTC, a hierarchical text clustering algorithm inspired by cognitive situation models. CogHTC extracts representative features from four carefully selected cognitive situation dimensions, with clustering efficiency taken into account. The experimental results demonstrate good performance of CogHTC and reveal that its clustering results are class or domain sensitive, and that CogHTC performed better on cross-class clustering than on inner-class clustering.
Keywords
cognitive systems; data mining; pattern clustering; query formulation; text analysis; cognitive situation dimensions; cross-class clustering; hierarchical text clustering; information searching; information understanding; inner-class clustering; text documents grouping; text mining; Clustering algorithms; Computer science; Data engineering; Data mining; Feature extraction; Frequency; Humans; Knowledge engineering; Testing; Text mining; Cognitive; Hierarchical; Situation Dimensions; Text Clustering;
fLanguage
English
Publisher
IEEE
Conference_Titel
Knowledge Discovery and Data Mining, 2009. WKDD 2009. Second International Workshop on
Conference_Location
Moscow
Print_ISBN
978-0-7695-3543-2
Type
conf
DOI
10.1109/WKDD.2009.17
Filename
4771866