A generic flow algorithm for shared filter ordering problems

Authors:
Zhen Liu;Srinivasan Parthasarathy;Anand Ranganathan;Hao Yang
Affiliations:
IBM T.J. Watson Research Center, Hawthorne, NY, USA;IBM T.J. Watson Research Center, Hawthorne, NY, USA;IBM T.J. Watson Research Center, Hawthorne, NY, USA;IBM T.J. Watson Research Center, Hawthorne, NY, USA
Venue:
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2008

Abstract

We consider a fundamental flow maximization problem that arises during the evaluation of multiple overlapping queries defined on a data stream in a heterogeneous parallel environment. Each query is a conjunction of boolean filters, and each filter may be shared across multiple queries. We are required to design an evaluation plan that evaluates filters against stream items in order to determine the set of queries satisfied by each item. The evaluation plan specifies for each item: (i) the subset of filters evaluated for this item and the order of their evaluations, and (ii) the processor on which each filter evaluation occurs. Our goal is to design an evaluation plan that maximizes the total throughput (flow) of the stream handled by the plan without violating the processor capacities. Filter ordering has received extensive attention in single-processor settings, with the objective of minimizing the total cost of filter evaluations: in particular, efficient (approximation) algorithms are known for various important versions of min-cost filter ordering. The min-cost filter ordering problem for a single processor is a special case of our flow maximization problem for parallel processors. Our main contribution in this work is a generic flow-maximization algorithm, which assumes the availability of a min-cost filter ordering algorithm for a single processor and uses it to iteratively construct a solution to the flow-maximization problem for heterogeneous parallel processors. We show that the approximation ratio of our flow-maximization strategy is essentially the same as that of the underlying min-cost filter ordering algorithm. Our result, along with existing results on min-cost filter ordering, enables the optimization of several important versions of filter ordering in parallel environments.
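
To make the link between min-cost ordering and throughput concrete, the sketch below covers only the simplest special case and is not the algorithm of the paper. It assumes a single query, a single processor, and filters that pass items independently with known per-item costs and pass probabilities. Under those assumptions, the classical rule of sorting filters by cost / (1 - pass probability) minimizes the expected evaluation cost per item, and the throughput the processor can sustain is its capacity divided by that cost. All function and parameter names (`expected_cost`, `min_cost_order`, `max_throughput`, `capacity`, `pass_prob`) are hypothetical and chosen for illustration.

```python
def expected_cost(order, cost, pass_prob):
    """Expected per-item cost when filters run in `order` and
    evaluation stops at the first filter the item fails."""
    total = 0.0
    reach = 1.0  # probability that the item reaches the current filter
    for f in order:
        total += reach * cost[f]
        reach *= pass_prob[f]
    return total


def min_cost_order(cost, pass_prob):
    """Order filters by cost / (1 - pass_prob), ascending.
    Assumes independent filters; filters that never drop an item go last."""
    def rank(f):
        drop = 1.0 - pass_prob[f]
        return cost[f] / drop if drop > 0 else float("inf")
    return sorted(cost, key=rank)


def max_throughput(capacity, cost, pass_prob):
    """Single-processor flow: items per unit time the processor can
    sustain when every item follows the min-cost order."""
    order = min_cost_order(cost, pass_prob)
    return order, capacity / expected_cost(order, cost, pass_prob)


if __name__ == "__main__":
    # Illustrative numbers only (not taken from the paper).
    cost = {"a": 4.0, "b": 1.0, "c": 2.0}
    pass_prob = {"a": 0.5, "b": 0.5, "c": 0.5}
    order, flow = max_throughput(12.0, cost, pass_prob)
    print(order, flow)
```

With these illustrative numbers the cheapest filter runs first, the expected cost per item is 1 + 0.5·2 + 0.25·4 = 3, and a processor of capacity 12 sustains a flow of 4 items per unit time. The setting of the abstract is harder on every axis (shared filters across queries, several heterogeneous processors, and possibly only an approximate ordering routine), which is why a generic reduction to the single-processor problem is useful.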