Visualizing differences in web search algorithms using the expected weighted hoeffding distance

  • Authors:
  • Mingxuan Sun;Guy Lebanon;Kevyn Collins-Thompson

  • Affiliations:
  • Georgia Inst. of Technology, Atlanta, GA, USA;Georgia Inst. of Technology, Atlanta, GA, USA;Microsoft Research, Redmond, WA, USA

  • Venue:
  • Proceedings of the 19th international conference on World wide web
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

We introduce a new dissimilarity function for ranked lists, the expected weighted Hoeffding distance, that has several advantages over current dissimilarity measures for ranked search results. First, it is easily customized for users who pay varying degrees of attention to websites at different ranks. Second, unlike existing measures such as generalized Kendall's tau, it is based on a true metric, preserving meaningful embeddings when visualization techniques like multi-dimensional scaling are applied. Third, our measure can effectively handle partial or missing rank information while retaining a probabilistic interpretation. Finally, the measure can be made computationally tractable and we give a highly efficient algorithm for computing it. We then apply our new metric with multi-dimensional scaling to visualize and explore relationships between the result sets from different search engines, showing how the weighted Hoeffding distance can distinguish important differences in search engine behavior that are not apparent with other rank-distance metrics. Such visualizations are highly effective at summarizing and analyzing insights on which search engines to use, what search strategies users can employ, and how search results evolve over time. We demonstrate our techniques using a collection of popular search engines, a representative set of queries, and frequently used query manipulation methods.