Constant-Time Query Processing

  • Authors:
  • Vijayshankar Raman; Garret Swart; Lin Qiao; Frederick Reiss; Vijay Dialani; Donald Kossmann; Inderpal Narang; Richard Sidle

  • Affiliations:
  • IBM Almaden Research Center, San Jose, CA, USA. ravijay@us; IBM Almaden Research Center, San Jose, CA, USA. garret@swart.org; IBM Almaden Research Center, San Jose, CA, USA. lsqiao@us; IBM Almaden Research Center, San Jose, CA, USA. frreiss@us; IBM Almaden Research Center, San Jose, CA, USA; ETH Zurich, Zurich, Switzerland. donald.kossmann@inf.ethz.ch; IBM Almaden Research Center, San Jose, CA, USA. inarang@us; IBM Almaden Research Center, San Jose, CA, USA. rsidle@almaden.ibm.com

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

Query performance in current systems depends significantly on tuning: how well the query matches the available indexes, materialized views, etc. Even in a well-tuned system, there are always some queries that take much longer than others. This frustrates users, who increasingly want consistent response times to ad hoc queries. We argue that query processors should instead aim for constant response times for all queries, with no assumption about tuning. We present Blink, our first attempt at this goal, which runs every query as a table scan over a fully denormalized database, with hash group-by done along the way. To make this scan efficient, Blink uses a novel compression scheme that horizontally partitions tuples by frequency, thereby compressing skewed data almost down to entropy, even while producing long runs of fixed-length, easily parseable values. We also present a scheme for evaluating a conjunction of range and equality predicates in SIMD fashion over compressed tuples, and different schemes for efficient hash-based aggregation within the L2 cache. An experimental study with a suite of arbitrary single-block SQL queries over a TPCH-like schema suggests that constant-time queries can be efficient.
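
The idea of partitioning tuples by frequency can be illustrated with a small sketch. The code below is not Blink's implementation; it is a simplified, single-column illustration under stated assumptions: the function names (`frequency_partition`, `average_bits`, `fixed_width_bits`) and the caller-supplied code widths are hypothetical. The most frequent values are placed in a partition with short fixed-length dictionary codes, rarer values go to partitions with longer codes, and every partition holds a run of codes of a single width.

```python
from collections import Counter


def frequency_partition(column, widths):
    """Partition a column's rows by value frequency.

    Hypothetical simplification: the i-th partition holds the next
    2**widths[i] most frequent values and encodes them with fixed-length
    dictionary codes of widths[i] bits.
    """
    ranked = [value for value, _ in Counter(column).most_common()]
    dictionaries = []
    start = 0
    for width in widths:
        chunk = ranked[start:start + 2 ** width]
        dictionaries.append({value: code for code, value in enumerate(chunk)})
        start += 2 ** width
    if start < len(ranked):
        raise ValueError("widths too small for the number of distinct values")

    # Horizontal partitioning: each row goes to the partition whose
    # dictionary contains its value, so codes in a partition share a width.
    partitions = [[] for _ in widths]
    for value in column:
        for index, dictionary in enumerate(dictionaries):
            if value in dictionary:
                partitions[index].append(dictionary[value])
                break
    return dictionaries, partitions


def average_bits(partitions, widths):
    """Average code length in bits per row across all partitions."""
    total_rows = sum(len(p) for p in partitions)
    total_bits = sum(len(p) * w for p, w in zip(partitions, widths))
    return total_bits / total_rows


def fixed_width_bits(column):
    """Bits per row if one fixed-width dictionary covered every value."""
    distinct = len(set(column))
    return max(1, (distinct - 1).bit_length())


# Skewed example data (made up for illustration).
column = ["US"] * 50 + ["CN"] * 30 + ["DE"] * 10 + ["FR"] * 5 + ["JP"] * 5
widths = (1, 3)
dictionaries, partitions = frequency_partition(column, widths)
print(average_bits(partitions, widths), fixed_width_bits(column))
```

In this made-up example the two most frequent values cover 80 of the 100 rows and need only 1-bit codes, so the average code length falls well below the 3 bits a single fixed-width dictionary would use, while each partition can still be scanned as a run of equal-width codes.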