Replicated declustering for arbitrary queries

  • Authors:
  • Ali Şaman Tosun

  • Affiliations:
  • University of Texas at San Antonio, San Antonio, TX

  • Venue:
  • Proceedings of the 2004 ACM symposium on Applied computing
  • Year:
  • 2004

Abstract

Declustering has attracted a lot of interest over the last couple of years. Recently, declustering using replication was proposed to reduce the additive overhead of declustering. Most of the work on declustering focuses on spatial range queries. However, in many scenarios, including multi-user environments, query shapes can be arbitrary. In this paper, we explore replicated declustering for arbitrary queries. Replication reduces the cost of arbitrary queries to manageable levels. First, we investigate theoretically what is possible using replication for arbitrary queries. Then, we propose a 2-copy replication strategy that achieves the theoretical limit and is therefore the best possible scheme. Using the proposed scheme, an arbitrary query containing b buckets requires a number of disk accesses bounded by [√b]. This is a significant improvement, especially for small queries, because with a single copy b buckets require min(b, N) disk accesses in the worst case. The proposed scheme works for nonuniform data as well as uniform data. Finally, we extend the proposed scheme to a partial replication scheme to achieve the best performance using limited replication.
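
To make the notion of retrieval cost concrete, the following is a minimal sketch, not the scheme proposed in the paper. It assumes each bucket is stored on one or more disks and that the cost of a query is the smallest k such that every bucket can be read from one of its replicas with no disk serving more than k buckets. The function names and the toy allocation are hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, List, Sequence, Set


def can_retrieve(query: Sequence[Hashable],
                 copies: Dict[Hashable, Sequence[int]],
                 k: int) -> bool:
    """Return True if every bucket in `query` can be read from one of its
    replicas while no disk serves more than k buckets (augmenting paths)."""
    assigned: Dict[int, List[Hashable]] = {}  # disk -> buckets it will serve

    def try_assign(bucket: Hashable, seen: Set[int]) -> bool:
        for disk in copies[bucket]:
            if disk in seen:
                continue
            seen.add(disk)
            held = assigned.setdefault(disk, [])
            if len(held) < k:
                held.append(bucket)
                return True
            # Disk is full: try to move one of its buckets to another replica.
            for i, other in enumerate(held):
                if try_assign(other, seen):
                    held[i] = bucket
                    return True
        return False

    return all(try_assign(bucket, set()) for bucket in query)


def min_accesses(query: Sequence[Hashable],
                 copies: Dict[Hashable, Sequence[int]]) -> int:
    """Smallest number of parallel disk accesses needed for the query."""
    for k in range(1, len(query) + 1):
        if can_retrieve(query, copies, k):
            return k
    return 0  # empty query


# Hypothetical toy allocation: four buckets, three disks (0, 1, 2).
query = ["a", "b", "c", "d"]
single_copy = {"a": (0,), "b": (0,), "c": (0,), "d": (0,)}
two_copies = {"a": (0, 1), "b": (0, 2), "c": (0, 1), "d": (0, 2)}

print(min_accesses(query, single_copy))  # all four buckets sit on disk 0
print(min_accesses(query, two_copies))   # second copies spread the load
```

In this toy case the single-copy allocation forces all four reads onto one disk, while the second copy lets the same query finish in two accesses. How the paper actually places the second copy is not described in the abstract, so the allocation above should be read only as an example input.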