A Novel Method for Hierarchical Clustering of Search Results

  • Authors:
  • Gang Zhang;Yue Liu;Songbo Tan;Xueqi Cheng

  • Venue:
  • WI-IATW '07 Proceedings of the 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops
  • Year:
  • 2007

Abstract

Search result clustering helps users quickly browse the documents returned by a search engine. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. Label-based clustering, which usually takes n-grams (typically bigrams) as label candidates, is quite promising. However, meaningless n-grams are not removed from the candidates. In this paper, document frequency (DF), user log, and query context are introduced as label ranking features, and an integrated model combines these three features for cluster label ranking. Furthermore, a novel graph-based clustering algorithm (GBCA) for hierarchical clustering is proposed. Experiments indicate that the cluster label extraction improves precision by about 8% over the baseline, and that GBCA outperforms STC and SnakeT in F-measure.
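The abstract does not give the form of the integrated model, so the following is only a minimal sketch of the general idea of ranking bigram label candidates by three features. It assumes a simple weighted sum, which may differ from the paper's model. The function names, the weights, and the inputs `log_counts` (how often a phrase appears in a query log) and `context_terms` (words related to the query) are hypothetical and chosen for illustration.

```python
from collections import Counter


def bigram_candidates(snippets):
    """Return the document frequency (DF) of each word bigram across snippets."""
    df = Counter()
    for text in snippets:
        tokens = text.lower().split()
        # Use a set so each bigram counts at most once per snippet.
        df.update({" ".join(pair) for pair in zip(tokens, tokens[1:])})
    return df


def rank_labels(snippets, log_counts, context_terms, weights=(0.5, 0.3, 0.2)):
    """Rank bigram candidates by a weighted sum of three feature scores.

    The weighted sum and the default weights are assumptions made for this
    sketch, not the model described in the paper.
    """
    df = bigram_candidates(snippets)
    n_snippets = len(snippets)
    max_log = max(log_counts.values(), default=0)
    w_df, w_log, w_ctx = weights

    scores = {}
    for label, count in df.items():
        # Fraction of snippets that contain the candidate.
        df_score = count / n_snippets
        # Query-log frequency, normalized by the most frequent logged phrase.
        log_score = log_counts.get(label, 0) / max_log if max_log else 0.0
        # Fraction of the candidate's words that occur in the query context.
        words = label.split()
        ctx_score = sum(word in context_terms for word in words) / len(words)
        scores[label] = w_df * df_score + w_log * log_score + w_ctx * ctx_score

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda label: (-scores[label], label))


snippets = [
    "jaguar cars for sale",
    "jaguar cars dealer",
    "the jaguar is a big cat",
]
ranked = rank_labels(snippets, {"jaguar cars": 10}, {"cars"})
print(ranked[0])
```

In this toy input, a candidate that occurs in several snippets, appears in the query log, and overlaps the query context is ranked above bigrams such as "is a" that occur only once and have no support from the other two features.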