Title :
A Novel Method for Hierarchical Clustering of Search Results
Author :
Zhang, Gang ; Liu, Yue ; Tan, Songbo ; Cheng, Xueqi
Author_Institution :
Inst. of Comput. Technol., Beijing
Abstract :
Search result clustering can help users quickly browse through the documents returned by a search engine. Traditional clustering techniques are inadequate because they don't generate clusters with highly readable names. Label-based clustering, which usually takes n-grams (typically bi-grams) as label candidates, is quite promising. However, meaningless n-grams are not removed from the candidates. In this paper, document frequency (DF), user log and query context are introduced as label ranking features. An integrated model is used to combine these three features for cluster label ranking. Furthermore, a novel graph-based clustering algorithm (GBCA) for hierarchical clustering is proposed. Experiments indicate that the cluster label extraction improves precision by about 8% over the baseline, and that GBCA outperforms STC and Snaket in F-measure.
Keywords :
document handling; graph theory; pattern clustering; query processing; search engines; cluster label extraction; document browsing; document frequency; graph based clustering algorithm; hierarchical clustering; label ranking features; label-based clustering; query context; search engine; search result clustering; user log; Animals; Clustering algorithms; Clustering methods; Computers; Conferences; Fractionation; Intelligent agent; Partitioning algorithms; Scattering; Search engines; clustering
Conference_Title :
Web Intelligence and Intelligent Agent Technology Workshops, 2007 IEEE/WIC/ACM International Conferences on
Conference_Location :
Silicon Valley, CA
Print_ISBN :
0-7695-3028-1
DOI :
10.1109/WI-IATW.2007.83