Distribution design for higher-order data models

  • Authors: Hui Ma; Klaus-Dieter Schewe; Qing Wang

  • Affiliation (all authors): Massey University, Department of Information Systems & Information Science Research Centre, Private Bag 11 222, Palmerston North, New Zealand

  • Venue: Data & Knowledge Engineering
  • Year: 2007

Abstract

Distribution design for databases usually addresses the problems of fragmentation, allocation and replication. However, the major purposes of distribution are to improve performance and to increase system reliability. The former aspect is particularly relevant in cases where the desire to distribute originates from the distributed nature of an organisation in which many data needs arise only locally, i.e., some data is retrieved and processed at only one or at most very few locations. Therefore, query optimisation should be treated as an intrinsic part of distribution design. In this paper the effects of fragmentation in databases on query processing are investigated using a query cost model. The considered databases are defined on higher-order data models, i.e., they capture complex-value, object-oriented and XML-based databases. The emphasis on higher-order data models enables a large variety of schema fragmentations, while at the same time imposing restrictions on the way schemata can be fragmented. It is shown that the allocation of locations to the nodes of an optimised query tree is only marginally affected by the allocation of fragments. This implies that optimisation of query processing and optimisation of fragment allocation are largely orthogonal to each other, leading to several scenarios for fragment allocation. If elementary fragmentation operations are ordered according to their likelihood of affecting the query costs, a binary search procedure can be adopted to find an "optimal" fragmentation and allocation. We underline these findings with experimental results.
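
The binary search idea mentioned at the end of the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure, which the abstract does not spell out. It assumes that the elementary fragmentation operations are already sorted by expected impact, that only prefixes of that ordering are considered (apply the first k operations), and that the total query cost is unimodal in k (it first falls, then rises). The names `best_prefix_length` and `toy_cost` are hypothetical, and `toy_cost` is an invented stand-in for a real query cost model.

```python
from typing import Callable


def best_prefix_length(n_ops: int, cost: Callable[[int], float]) -> int:
    """Return k in [0, n_ops] that minimises cost(k).

    ASSUMPTION (not from the paper): cost is unimodal in k, i.e. strictly
    decreasing up to its minimum and non-decreasing afterwards. Under that
    assumption, comparing cost(mid) with cost(mid + 1) tells us on which
    side of the minimum mid lies, so a binary search suffices.
    """
    lo, hi = 0, n_ops
    while lo < hi:
        mid = (lo + hi) // 2
        if cost(mid + 1) < cost(mid):
            # Still on the decreasing side: the minimum lies to the right.
            lo = mid + 1
        else:
            # At or past the minimum: keep mid as a candidate.
            hi = mid
    return lo


def toy_cost(k: int) -> float:
    """Invented cost curve: applying 3 operations is cheapest."""
    return (k - 3) ** 2 + 10.0


if __name__ == "__main__":
    k = best_prefix_length(8, toy_cost)
    print(f"apply the first {k} operations, cost = {toy_cost(k)}")
```

The search evaluates the cost model only O(log n) times for n candidate operations, which matters when each evaluation requires costing a whole query workload. If the cost curve is not unimodal, the result is only a local minimum.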