Query optimization that involves expensive predicates has received considerable attention in the database community. Typically, the output of a database query is a set of tuples that satisfy certain conditions, and, with expensive predicates, these conditions may be computationally costly to verify. In the simplest case, when the query looks for the set of tuples that simultaneously satisfy k expensive predicates, the problem reduces to ordering the evaluation of the predicates so as to minimize the time to output the set of tuples comprising the answer to the query. We study different cases of the problem: the sequential case, in which a single processor is available to evaluate the predicates, and the distributed case, in which there are k processors available, each dedicated to a different attribute (column) of the database, and there is no communication cost between the processors. For the sequential case, we give a simple and fast deterministic k-approximation algorithm, and prove that k is the best possible approximation ratio for a deterministic algorithm, even if exponential time algorithms are allowed. We also propose a randomized, polynomial time algorithm with expected approximation ratio 1 + √2/2 ≈ 1.707 for k = 2, and prove that 3/2 is the best possible expected approximation ratio for randomized algorithms. We also show that, given 0 ≤ ϵ ≤ 1, no randomized algorithm achieves approximation ratio smaller than 1 + ϵ with probability larger than (1 + ϵ)/2. For the distributed case, we consider two different models: the preemptive model, in which a processor is allowed to interrupt the evaluation of a predicate, and the nonpreemptive model, in which the evaluation of a predicate must be completed once started. We show that k is the best possible approximation ratio for a deterministic algorithm, even if exponential time algorithms are allowed. For the preemptive model, we introduce a polynomial time k-approximation algorithm.
For the nonpreemptive model, we introduce a polynomial time O(k log² k)-approximation algorithm.
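To make the sequential setting concrete, the following is a minimal sketch of how an evaluation order might be compared against an offline optimum. It is an illustration only, not one of the algorithms described above. The cost model is an assumption: each (tuple, predicate) pair has its own evaluation cost, a tuple is rejected as soon as one of its predicates is found false, and the offline optimum is taken to be the cheapest way to certify each tuple when all truth values are known in advance. The names `strategy_cost` and `offline_optimum` and the sample data are hypothetical.

```python
from typing import Sequence


def strategy_cost(
    truth: Sequence[Sequence[bool]],
    cost: Sequence[Sequence[float]],
    order: Sequence[int],
) -> float:
    """Total cost of checking every tuple's predicates in one fixed order,
    stopping at the first predicate that evaluates to false."""
    total = 0.0
    for row_truth, row_cost in zip(truth, cost):
        for j in order:
            total += row_cost[j]
            if not row_truth[j]:
                break  # tuple rejected; remaining predicates are skipped
    return total


def offline_optimum(
    truth: Sequence[Sequence[bool]],
    cost: Sequence[Sequence[float]],
) -> float:
    """Cheapest certificate per tuple when truth values are known up front
    (assumed benchmark): the cheapest false predicate if one exists,
    otherwise all k predicates must be evaluated."""
    total = 0.0
    for row_truth, row_cost in zip(truth, cost):
        false_costs = [c for t, c in zip(row_truth, row_cost) if not t]
        total += min(false_costs) if false_costs else sum(row_cost)
    return total


# Hypothetical instance: 3 tuples, k = 2 predicates.
truth = [[True, False], [False, True], [True, True]]
cost = [[1, 5], [4, 2], [3, 3]]

cost_01 = strategy_cost(truth, cost, (0, 1))  # predicate 0 first
cost_10 = strategy_cost(truth, cost, (1, 0))  # predicate 1 first
best = offline_optimum(truth, cost)
```

On this instance the two fixed orders cost 16 and 17, while the assumed offline optimum is 15; the ratio of a strategy's cost to that optimum is the quantity an approximation ratio bounds.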