Title :
Search Results Clustering Based on a Linear Weighting Method of Similarity
Author :
Zheng, Dequan ; Liu, Haibo ; Zhao, Tiejun
Author_Institution :
MOE-MS Key Lab. of Natural Language Process. & Speech, Harbin Inst. of Technol., Harbin, China
Abstract :
Clustering search results helps users find what they need in a large volume of information, but traditional text clustering has proven not effective enough for this task. The Lingo algorithm, which uses LSI for clustering, first generates candidate labels, then assigns documents to them, and finally forms the clusters. Building on the Lingo algorithm, this paper presents an improved Single-Pass method based on linear weighting, which integrates HowNet semantic similarity with cosine similarity, merges and rediscovers clusters, and extracts the cluster labels. Experiments show that the method achieves good clustering results as measured by purity and F-measure.
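The abstract does not give the weighting formula, so the following is only a minimal sketch of how a linearly weighted similarity could drive a Single-Pass clustering loop. The weight `alpha`, the `threshold`, the summed-count centroid, and every function name are assumptions made for illustration. `jaccard_stand_in` is a hypothetical placeholder for a HowNet-based semantic score, not HowNet itself, and the Lingo label generation and cluster merging steps are not modeled.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_stand_in(a, b):
    """Hypothetical placeholder for a HowNet semantic score: term-set overlap."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def combined_similarity(a, b, semantic_similarity, alpha=0.5):
    """Linear weighting: alpha * semantic + (1 - alpha) * cosine (alpha is assumed)."""
    return alpha * semantic_similarity(a, b) + (1 - alpha) * cosine_similarity(a, b)


def single_pass(docs, semantic_similarity, alpha=0.5, threshold=0.3):
    """Assign each document to its most similar cluster, or open a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc indices]}
    for index, text in enumerate(docs):
        vector = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = combined_similarity(vector, cluster["centroid"],
                                      semantic_similarity, alpha)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(index)
            best["centroid"].update(vector)  # centroid kept as summed term counts
        else:
            clusters.append({"centroid": Counter(vector), "members": [index]})
    return clusters


if __name__ == "__main__":
    snippets = ["apple banana fruit", "apple banana juice", "car engine wheel"]
    for cluster in single_pass(snippets, jaccard_stand_in):
        print(cluster["members"])
```

In this sketch a larger `alpha` gives more weight to the semantic score and a smaller one to plain term overlap; the values shown are arbitrary and would need tuning against purity and F-measure on real search results.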
Keywords :
information needs; pattern clustering; search problems; text analysis; F-measure; LSI; Lingo algorithm; document handling; information need; linear weighting method; semantic similarity; text clustering; Clustering algorithms; Feature extraction; Fuses; Matrix decomposition; Search engines; Semantics; Vectors; Cosine similarity; Information retrieval
Conference_Title :
Asian Language Processing (IALP), 2011 International Conference on
Conference_Location :
Penang
Print_ISBN :
978-1-4577-1733-8
DOI :
10.1109/IALP.2011.72