• DocumentCode
    2731959
  • Title
    A Novel Method for Hierarchical Clustering of Search Results
  • Author
    Zhang, Gang ; Liu, Yue ; Tan, Songbo ; Cheng, Xueqi
  • Author_Institution
    Inst. of Comput. Technol., Beijing
  • fYear
    2007
  • fDate
    5-12 Nov. 2007
  • Firstpage
    181
  • Lastpage
    184
  • Abstract
    Search result clustering can help users quickly browse through the documents returned by a search engine. Traditional clustering techniques are inadequate because they don't generate clusters with highly readable names. Label-based clustering, which usually takes n-grams (typically bigrams) as label candidates, is quite promising. However, meaningless n-grams are not removed from the candidates. In this paper, document frequency (DF), user log and query context are introduced as label ranking features. An integrated model is used to combine these three features for cluster label ranking. Furthermore, a novel graph-based clustering algorithm (GBCA) for hierarchical clustering is proposed. Experiments indicate that the cluster label extraction improves precision by about 8% over the baseline, and that GBCA outperforms STC and Snaket in F-measure.
  • Keywords
    document handling; graph theory; pattern clustering; query processing; search engines; cluster label extraction; document browsing; document frequency; graph based clustering algorithm; hierarchical clustering; label ranking features; label-based clustering; query context; search engine; search result clustering; user log; Animals; Clustering algorithms; Clustering methods; Computers; Conferences; Fractionation; Intelligent agent; Partitioning algorithms; Scattering; Search engines; clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology Workshops, 2007 IEEE/WIC/ACM International Conferences on
  • Conference_Location
    Silicon Valley, CA
  • Print_ISBN
    0-7695-3028-1
  • Type
    conf
  • DOI
    10.1109/WI-IATW.2007.83
  • Filename
    4427567