Allocating relations in a distributed database system

Authors:
D. J. Reid; M. Orlowska
Affiliations:
Department of Computer Science, The University of Queensland, St. Lucia, Queensland 4072, Australia; Department of Computer Science, The University of Queensland, St. Lucia, Queensland 4072, Australia
Venue:
Mathematical and Computer Modelling: An International Journal
Year:
1995

Citing 21
Cited 1

Optimizing joins between two partitioned relations in distributed databases. Journal of Parallel and Distributed Computing.
Theory of linear and integer programming.
Data allocation in distributed database systems. ACM Transactions on Database Systems (TODS).
Optimizing Join Queries in Distributed Databases. IEEE Transactions on Software Engineering.
Optimizing equijoin queries in distributed databases where relations are hash partitioned. ACM Transactions on Database Systems (TODS).
Optimal disk allocation for partial match queries. ACM Transactions on Database Systems (TODS).
An improved branch and bound algorithm for mixed integer nonlinear programs. Computers and Operations Research.
Genetic algorithm based approach for file allocation on distributed systems. Computers and Operations Research - Special issue on genetic algorithms.
An efficient processing of a chain join with the minimum communication cost in distributed database systems. Distributed and Parallel Databases.
An integer linear programming approach to data allocation with the minimum total communication cost in distributed database systems. Information Sciences—Informatics and Computer Science: An International Journal.
The theory of joins in relational databases. ACM Transactions on Database Systems (TODS).
Independent components of relations. ACM Transactions on Database Systems (TODS).
Using Semi-Joins to Solve Relational Queries. Journal of the ACM (JACM).
On the Desirability of Acyclic Database Schemes. Journal of the ACM (JACM).
Degrees of acyclicity for hypergraphs and relational database schemes. Journal of the ACM (JACM).
Query processing utilizing dependencies and horizontal decomposition. SIGMOD '83 Proceedings of the 1983 ACM SIGMOD international conference on Management of data.
Query processing for distributed databases using generalized semi-joins. SIGMOD '82 Proceedings of the 1982 ACM SIGMOD international conference on Management of data.
Optimal File Allocation in a Multiple Computer System. IEEE Transactions on Computers.
The complexity of processing tree queries in distributed databases. SPDP '90 Proceedings of the 1990 IEEE Second Symposium on Parallel and Distributed Processing.
Incorporating processor costs in optimizing the distributed execution of join queries. Mathematical and Computer Modelling: An International Journal.
Evaluating multiple join queries in a distributed database system. Mathematical and Computer Modelling: An International Journal.

Minimizing the response time of executing a join between fragmented relations in a distributed database system. Mathematical and Computer Modelling: An International Journal.

Quantified Score

Hi-index	0.98

Abstract

A model is proposed that allocates the tables of a relational database to the sites of a distributed system so that the total cost of executing a given collection of join queries is minimized. The model is formulated as an integer linear program. Each query specifies that several logically distinct data sets, or relations, are to be combined and the result presented to the user who issued the request. Performing this task consumes limited system resources: both the processors and the communication facilities that interconnect them may be used. An optimal strategy for executing a single query is therefore defined as one that minimizes a weighted sum of the computation costs and the communication costs incurred during execution. One particular model, appearing in [1], conforms to this philosophy and so forms the basis for further investigation. The total cost of executing an entire group of such queries depends on how the relevant data are allocated to the sites of the network. Several copies of any particular relation may be dispersed across the network; replicating data increases its availability and potentially decreases the cost of answering the given requests. However, storage capacity is limited, and increased replication incurs greater overhead in maintaining consistency. An optimization program is developed to design a data allocation plan that achieves minimal total cost for executing a given group of requests while respecting constraints on the permissible level of data replication.
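The allocation problem described in the abstract can be illustrated on a toy instance. The sketch below is not the authors' integer linear program and does not use the query cost model of [1]; it is a minimal stand-in that enumerates every placement by exhaustive search, which is only workable for very small inputs. All names and numbers (SITES, SIZE, CAPACITY, CPU, MAX_COPIES, ALPHA, BETA, RESULT_SIZE, QUERIES) are hypothetical, and the per-query cost is an assumed simplification: the join runs at a single site, relations not stored there are shipped in, and the result is shipped to the issuing site if it differs.

```python
from itertools import combinations, product

# All names and numbers below are hypothetical illustration data.
SITES = ("A", "B", "C")
SIZE = {"R": 4, "S": 2, "T": 3}          # relation sizes in arbitrary units
CAPACITY = {"A": 6, "B": 6, "C": 6}      # storage available at each site
CPU = {"A": 1.0, "B": 1.0, "C": 2.0}     # processing cost per unit of data
MAX_COPIES = 2                           # replication limit per relation
ALPHA, BETA = 1.0, 1.0                   # weights: computation, communication
RESULT_SIZE = 1                          # assumed size of every join result
# Each query: (issuing site, relations to join, frequency).
QUERIES = [("A", ("R", "S"), 5), ("B", ("S", "T"), 3), ("C", ("R", "T"), 1)]


def query_cost(alloc, origin, rels):
    """Cheapest weighted cost of running one query at a single site."""
    costs = []
    for site in SITES:
        compute = CPU[site] * sum(SIZE[r] for r in rels)
        # Ship in every relation that has no copy at the executing site.
        shipped = sum(SIZE[r] for r in rels if site not in alloc[r])
        if site != origin:
            shipped += RESULT_SIZE       # send the result back to the user
        costs.append(ALPHA * compute + BETA * shipped)
    return min(costs)


def total_cost(alloc):
    """Frequency-weighted cost of the whole query collection."""
    return sum(freq * query_cost(alloc, origin, rels)
               for origin, rels, freq in QUERIES)


def feasible(alloc):
    """True if no site stores more data than its capacity allows."""
    return all(
        sum(SIZE[r] for r, copies in alloc.items() if s in copies) <= CAPACITY[s]
        for s in SITES
    )


def best_allocation():
    """Exhaustively search placements with 1..MAX_COPIES copies per relation."""
    options = [frozenset(c)
               for k in range(1, MAX_COPIES + 1)
               for c in combinations(SITES, k)]
    candidates = (dict(zip(SIZE, choice))
                  for choice in product(options, repeat=len(SIZE)))
    return min((a for a in candidates if feasible(a)), key=total_cost)


best = best_allocation()
print({r: sorted(copies) for r, copies in best.items()}, total_cost(best))
```

An integer linear programming formulation would replace the enumeration in best_allocation with binary placement variables and linear capacity and replication constraints; the exhaustive search is used here only because it needs no solver library.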