Anytime measures for top-k algorithms on exact and fuzzy data sets

  • Authors:
  • Benjamin Arai; Gautam Das; Dimitrios Gunopulos; Nick Koudas

  • Affiliations:
  • University of California, Riverside, USA; University of Texas, Arlington, USA; University of Athens, Athens, Greece; University of Toronto, Toronto, Canada

  • Venue:
  • The VLDB Journal — The International Journal on Very Large Data Bases
  • Year:
  • 2009

Abstract

Top-k queries on large multi-attribute data sets are fundamental operations in information retrieval and ranking applications. In this article, we initiate research on the anytime behavior of top-k algorithms on exact and fuzzy data. In particular, given specific top-k algorithms (TA and TA-Sorted), we study their progress toward identifying the correct result at any point during execution. We adopt a probabilistic approach in which the algorithm can report, at any point during its execution, the confidence that the top-k result has already been identified. Such functionality is valuable when the goal is to reduce the runtime cost of top-k computations. We present a thorough experimental evaluation that validates our techniques on both synthetic and real data sets.
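
To make the setting concrete, the following is a minimal Python sketch of the classic Threshold Algorithm (TA) written as a generator, so that its intermediate state can be inspected after every round of sorted access. It is an illustration only, not the authors' implementation, and it does not compute the probabilistic confidence measure the article proposes; it merely exposes the quantities (current candidates and stopping threshold) that an anytime measure would reason about. The function name `ta_anytime`, the sample data, the use of `sum` as the aggregation function, and the assumption that every object appears in every list are all hypothetical choices made for this sketch.

```python
import heapq
from typing import Callable, Iterator, List, Tuple

Entry = Tuple[str, float]  # (object id, attribute score)


def ta_anytime(
    lists: List[List[Entry]],
    k: int,
    agg: Callable[[List[float]], float] = sum,
) -> Iterator[Tuple[int, List[Entry], float, bool]]:
    """Run TA and yield (depth, current_top_k, threshold, done) per round.

    Assumes each list is sorted by score in descending order and that
    every object appears in every list (illustrative simplification).
    """
    # One dictionary per list stands in for random access by object id.
    lookup = [dict(lst) for lst in lists]
    scores = {}  # full aggregated score of every object seen so far

    for depth in range(len(lists[0])):
        # Sorted access: read the next entry of each list in parallel.
        for lst in lists:
            obj = lst[depth][0]
            if obj not in scores:
                # Random access: fetch the object's score in every list.
                scores[obj] = agg([idx[obj] for idx in lookup])

        # Threshold: best score any still-unseen object could reach.
        threshold = agg([lst[depth][1] for lst in lists])
        top_k = heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])

        # TA stops once the k-th best seen score reaches the threshold.
        done = len(top_k) == k and top_k[-1][1] >= threshold
        yield depth + 1, top_k, threshold, done
        if done:
            return


# Hypothetical two-attribute data set, each list sorted by descending score.
lists = [
    [("a", 1.0), ("b", 0.75), ("c", 0.5), ("d", 0.25)],
    [("b", 1.0), ("c", 0.75), ("a", 0.25), ("d", 0.125)],
]

snapshots = list(ta_anytime(lists, k=1))
for depth, top_k, threshold, done in snapshots:
    print(depth, top_k, threshold, done)
```

On this sample input the candidate `b` is already the best seen object after the first round, but TA can only certify it after the second round, when the threshold drops to its score or below. That gap between having the answer and being able to prove it is the window in which a confidence estimate of the kind described in the abstract would be useful.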