Allocating relations in a distributed database system

  • Authors:
  • D. J. Reid; M. Orlowska

  • Affiliations:
  • Department of Computer Science, The University of Queensland, St. Lucia, Queensland 4072, Australia (both authors)

  • Venue:
  • Mathematical and Computer Modelling: An International Journal
  • Year:
  • 1995

Abstract

A model is proposed that allocates the tables of a relational database to the sites of a distributed system so that the total cost of executing a given collection of join queries is minimized. The model is presented in the convenient form of an integer linear program. Each individual query specifies that several logically distinct data sets, or relations, are to be combined and presented to the user who issued the request. Performing this task consumes limited system resources: both the processors and the communication facilities that interconnect them may be used. An optimal strategy for executing a single query is therefore defined to be one that minimizes a weighted sum of the computation costs and the communication costs incurred during execution. One particular model, appearing in [1], conforms to this philosophy and so forms the basis for further investigation. The total cost of executing an entire group of such queries depends on how the relevant information is allocated to the sites of the network. Several copies of any particular relation may be dispersed across the network; replicating data increases its availability and potentially decreases the cost of answering the given requests. However, storage capacity is limited, and increased replication incurs greater overhead in maintaining consistency. An optimization program is developed to design a data allocation plan that achieves a minimal total cost for executing a given group of requests, while respecting constraints on the permissible levels of data replication.
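The allocation problem described in the abstract can be illustrated with a toy sketch. The code below is not the authors' integer linear program: it replaces the ILP with brute-force enumeration, ignores computation costs, and charges each query only for shipping every needed relation from its cheapest copy. All names (`best_allocation`, `link_cost`, `max_copies`) and all numbers are hypothetical, chosen only to show how storage capacity and a replication limit constrain the search.

```python
from itertools import combinations, product


def candidate_placements(sites, max_copies):
    """All non-empty subsets of sites holding at most max_copies replicas."""
    return [set(c)
            for k in range(1, max_copies + 1)
            for c in combinations(sites, k)]


def query_cost(query, placement, sizes, link_cost):
    """Frequency-weighted cost of shipping each needed relation
    from its cheapest copy to the site that issued the query."""
    total = 0
    for rel in query["relations"]:
        cheapest = min(link_cost[s][query["site"]] for s in placement[rel])
        total += sizes[rel] * cheapest
    return query["frequency"] * total


def best_allocation(sites, sizes, capacity, queries, link_cost, max_copies):
    """Enumerate every placement, discard those that exceed storage
    capacity, and return (cost, placement) for the cheapest one."""
    relations = sorted(sizes)
    options = candidate_placements(sites, max_copies)
    best_cost, best_placement = float("inf"), None
    for choice in product(options, repeat=len(relations)):
        placement = dict(zip(relations, choice))
        used = {s: 0 for s in sites}
        for rel, holders in placement.items():
            for s in holders:
                used[s] += sizes[rel]
        if any(used[s] > capacity[s] for s in sites):
            continue  # violates a storage constraint
        cost = sum(query_cost(q, placement, sizes, link_cost) for q in queries)
        if cost < best_cost:
            best_cost, best_placement = cost, placement
    return best_cost, best_placement


# Hypothetical two-site instance (assumed data, not from the paper).
sites = ["A", "B"]
sizes = {"R": 10, "S": 5}
capacity = {"A": 10, "B": 15}
link_cost = {"A": {"A": 0, "B": 1}, "B": {"A": 1, "B": 0}}
queries = [
    {"site": "A", "relations": ["R", "S"], "frequency": 2},
    {"site": "B", "relations": ["S"], "frequency": 1},
]

cost, placement = best_allocation(sites, sizes, capacity, queries,
                                  link_cost, max_copies=2)
print(cost, placement)
```

Enumeration grows exponentially with the number of relations and sites, which is one reason a formulation such as an integer linear program is attractive for realistic instances.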