Optimal indexing using near-minimal space

Authors:
C. Heeren; H. V. Jagadish; L. Pitt
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL; University of Michigan, Ann Arbor, MI; University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2003



Abstract

We consider the index selection problem. Given either a fixed query workload or an unknown probability distribution on possible future queries, and a bound $B$ on how much space is available to build indices, we seek to build a collection of indices for which the average query response time is minimized. We give strong negative and positive performance bounds. Let $m$ be the number of queries in the workload. We show how to obtain with high probability a collection of indices using space $O(B \ln m)$ for which the average query cost is $\mathrm{opt}_B$, the optimal performance possible for indices using at most $B$ total space. Moreover, this space relaxation is necessary: unless $NP \subseteq n^{O(\log \log n)}$, no polynomial time algorithm can guarantee average query cost less than $M^{1-\varepsilon}\,\mathrm{opt}_B$ using space $\alpha B$, for any constant $\alpha$, where $M$ is the size of the dataset. We quantify the error in performance introduced by running the algorithm on a sample drawn from a query distribution.
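
To make the problem statement concrete, the following is a minimal sketch of the objective only, not the algorithm from the paper. It assumes a hypothetical cost model in which each query is answered by the cheapest index in the chosen collection (or by a fallback with no index), and it finds the optimum for a space bound by exhaustive search on a toy instance. The names `average_cost`, `opt_b`, `sizes`, `cost`, and `base_cost` are illustrative assumptions and do not appear in the source.

```python
from itertools import combinations


def average_cost(chosen, cost, base_cost):
    """Average query cost when each query uses its cheapest chosen index.

    cost[q][i] is the assumed cost of answering query q with index i;
    base_cost[q] is the assumed cost of answering q with no index.
    """
    total = 0.0
    for q, per_index in enumerate(cost):
        best = base_cost[q]
        for i in chosen:
            best = min(best, per_index[i])
        total += best
    return total / len(cost)


def opt_b(sizes, cost, base_cost, budget):
    """Exhaustively search for the best index set of total size <= budget.

    Exponential in the number of candidate indices; usable only on toy inputs.
    """
    n = len(sizes)
    best_set, best_avg = (), average_cost((), cost, base_cost)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if sum(sizes[i] for i in subset) <= budget:
                avg = average_cost(subset, cost, base_cost)
                if avg < best_avg:
                    best_set, best_avg = subset, avg
    return best_set, best_avg


# Toy instance (made-up numbers): three candidate indices, three queries.
sizes = [4, 3, 3]
base_cost = [10.0, 10.0, 10.0]
cost = [
    [1.0, 2.0, 10.0],   # query 0
    [1.0, 10.0, 2.0],   # query 1
    [10.0, 1.0, 1.0],   # query 2
]

tight_set, tight_avg = opt_b(sizes, cost, base_cost, budget=4)
relaxed_set, relaxed_avg = opt_b(sizes, cost, base_cost, budget=6)
print(tight_set, tight_avg)
print(relaxed_set, relaxed_avg)
```

On this toy instance, raising the budget from 4 to 6 lets a different collection be chosen and lowers the average cost, which is the kind of trade-off between space and query cost that the abstract's bounds describe. The exhaustive search is shown only to define the optimum; the abstract's hardness result indicates why an efficient exact method should not be expected in general.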