Exploring the magic of WAND

  • Authors: Matthias Petri; J. Shane Culpepper; Alistair Moffat

  • Affiliations: RMIT University, Australia and The University of Melbourne, Australia; RMIT University, Australia; The University of Melbourne, Australia

  • Venue: Proceedings of the 18th Australasian Document Computing Symposium

  • Year: 2013

Abstract

Web search services process thousands of queries per second, and filter their answers from collections containing very large amounts of data. Fast response to queries is a critical service expectation. The well-known WAND processing strategy is one way of reducing the amount of computation necessary when executing such a query. The value of WAND has been validated in a wide range of studies, and WAND has become one of the key baselines against which all new top-k processing algorithms are benchmarked. However, most previous implementations of WAND-based retrieval approaches have been in the context of the Okapi BM25 similarity scoring regime. Here we measure the performance of WAND in the context of the alternative Language Model similarity score computation, and find that the dramatic efficiency gains reported in previous studies are no longer achievable. That is, when the primary goal of a retrieval system is to maximize effectiveness, WAND is relatively unhelpful in terms of attaining the secondary objective of maximizing query throughput rates. The BM-WAND algorithm does in fact help reduce the percentage of postings that must be scored, but at the cost of additional computational overhead. We explore a variety of tradeoffs between scoring metric and processing regime, and present new insights into how score-safe algorithms interact with rank scoring.
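
For readers unfamiliar with the technique, the pivot-based skipping that WAND relies on can be sketched roughly as below. This is a toy illustration and not the authors' implementation: the function names (`wand_top_k`, `exhaustive_top_k`), the in-memory postings layout, and the use of precomputed per-posting scores are all assumptions made for the example. Each term carries an upper bound on its score contribution, and a document is fully scored only when the accumulated upper bounds of the terms that could contain it exceed the current top-k threshold.

```python
import heapq
from bisect import bisect_left


def wand_top_k(postings, k):
    """Toy WAND. postings maps term -> [(docid, score), ...] sorted by docid.

    Returns (top-k list of (docid, score), number of documents fully scored).
    Layout and names are hypothetical, chosen only for illustration.
    """
    lists = [p for p in postings.values() if p]
    docids = [[d for d, _ in p] for p in lists]
    upper = [max(s for _, s in p) for p in lists]  # per-term score upper bound
    pos = [0] * len(lists)                         # cursor into each list
    heap = []                                      # min-heap of (score, docid)
    scored = 0
    while True:
        # Threshold: lowest score in a full heap, otherwise accept anything.
        theta = heap[0][0] if len(heap) == k else float("-inf")
        # Unfinished lists, ordered by the docid under each cursor.
        live = sorted((i for i in range(len(lists)) if pos[i] < len(lists[i])),
                      key=lambda i: docids[i][pos[i]])
        # Pivot: first list at which the summed upper bounds beat theta.
        acc, pivot = 0.0, None
        for rank, i in enumerate(live):
            acc += upper[i]
            if acc > theta:
                pivot = rank
                break
        if pivot is None:
            break  # no remaining document can enter the top k
        pivot_doc = docids[live[pivot]][pos[live[pivot]]]
        first = live[0]
        if docids[first][pos[first]] == pivot_doc:
            # Every list up to the pivot sits on pivot_doc: score it in full.
            score = 0.0
            for i in live:
                if docids[i][pos[i]] != pivot_doc:
                    break
                score += lists[i][pos[i]][1]
                pos[i] += 1
            scored += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, pivot_doc))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, pivot_doc))
        else:
            # Skip the lagging list forward to pivot_doc without scoring.
            pos[first] = bisect_left(docids[first], pivot_doc, pos[first])
    top = sorted(heap, key=lambda e: (-e[0], e[1]))
    return [(d, s) for s, d in top], scored


def exhaustive_top_k(postings, k):
    """Reference: score every document, then keep the k best."""
    totals = {}
    for plist in postings.values():
        for d, s in plist:
            totals[d] = totals.get(d, 0.0) + s
    return sorted(totals.items(), key=lambda e: (-e[1], e[0]))[:k]


example = {
    "a": [(1, 1.0), (3, 2.0), (5, 1.0), (8, 3.0)],
    "b": [(2, 1.0), (3, 1.0), (8, 2.0), (9, 1.0)],
    "c": [(3, 4.0), (7, 1.0), (8, 1.0)],
}
top, fully_scored = wand_top_k(example, 2)
```

In this sketch the result matches exhaustive scoring whenever there are no ties at the k-th score, which is the sense in which such a strategy is score-safe. How many documents are skipped depends on how tightly the per-term upper bounds track the actual scores, so the savings can vary with the similarity function in use.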