On the Complexity of Distributed Query Optimization

Authors:
Chihping Wang;Ming-Syan Chen
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
1996


Abstract

While significant research effort has been reported on developing join- and semijoin-based algorithms for distributed query processing, relatively little progress has been made in exploring the complexity of the problems studied. Many researchers have therefore worked on proving NP-hardness of, or devising polynomial-time algorithms for, certain distributed query optimization problems. However, owing to their inherent difficulty, the complexity of the majority of distributed query optimization problems remains unknown. In this paper we give a general characterization of distributed query optimization problems and provide a framework for exploring their complexity. As will be shown, most distributed query optimization problems can be transformed into an optimization problem comprising a set of binary decisions, termed the Sum Product Optimization (SPO) problem. We first prove that SPO is NP-hard, based on the NP-completeness of the well-known Knapsack (KNAP) problem. Then, using this result as a basis, we prove that five classes of distributed query optimization problems, which cover the majority of the distributed query optimization problems previously studied in the literature, are NP-hard by polynomially reducing SPO to each of them. The details of each problem transformation are derived. We not only prove the conjecture that many prior studies relied upon, but also provide a framework for future related studies.
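
To make the phrase "a set of binary decisions" concrete, the sketch below states the textbook Knapsack decision problem as a search over 0/1 choice vectors. It illustrates only KNAP, the starting point of the reduction chain; the abstract does not give the SPO formulation, so none is attempted here. The function name `knapsack_decision` and its parameters are hypothetical, and the exhaustive search is for illustration only, since it takes time exponential in the number of items.

```python
from itertools import product


def knapsack_decision(sizes, values, capacity, goal):
    """Textbook Knapsack decision problem, solved by brute force.

    Returns a 0/1 tuple (one binary decision per item) whose total size
    is at most `capacity` and whose total value is at least `goal`,
    or None if no such choice exists.
    """
    for choice in product((0, 1), repeat=len(sizes)):
        total_size = sum(s for s, x in zip(sizes, choice) if x)
        total_value = sum(v for v, x in zip(values, choice) if x)
        if total_size <= capacity and total_value >= goal:
            return choice
    return None


if __name__ == "__main__":
    # Three hypothetical items; ask whether value 9 fits within size 7.
    print(knapsack_decision([3, 4, 5], [4, 5, 6], capacity=7, goal=9))
```

Each candidate solution is a vector of yes/no decisions, which is the structural feature the abstract attributes to SPO as well.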