Efficient retrieval of replicated data

  • Authors: Ali Şaman Tosun
  • Affiliations: Department of Computer Science, University of Texas at San Antonio, San Antonio 78249
  • Venue: Distributed and Parallel Databases
  • Year: 2006

Abstract

Declustering is a common technique for reducing query response times: data is distributed over multiple disks so that query retrieval can be parallelized. Most research on declustering targets spatial range queries and investigates schemes with low additive error. Recently, declustering with replication has been proposed to reduce this additive overhead. Replication significantly reduces the retrieval cost of arbitrary queries. In this paper, we propose a disk allocation and retrieval mechanism for arbitrary queries based on design theory. Using the proposed c-copy replicated declustering scheme, $$(c-1)k^{2}+ck$$ buckets can be retrieved using at most k disk accesses. The retrieval algorithm is asymptotically optimal, with $$\Theta(|Q|)$$ complexity for a query Q. In addition to the deterministic worst-case bound and efficient retrieval, the proposed algorithm handles nonuniform data and high dimensions, supports incremental declustering, and has good fault tolerance. Experimental results show the feasibility of the algorithm.
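
To make the stated bound concrete, the following minimal sketch evaluates $$(c-1)k^{2}+ck$$ for given c and k, and finds the smallest k whose bound covers a query of a given size. It only does the arithmetic of the formula in the abstract; it is not the paper's allocation or retrieval algorithm. The function names `max_buckets` and `min_accesses` are hypothetical, and reading the bound as a guarantee for any query of at most that many buckets is an assumption made here.

```python
def max_buckets(c: int, k: int) -> int:
    """Bound quoted in the abstract: (c-1)*k^2 + c*k buckets for k accesses."""
    if c < 1 or k < 0:
        raise ValueError("expected c >= 1 copies and k >= 0 accesses")
    return (c - 1) * k * k + c * k


def min_accesses(c: int, query_size: int) -> int:
    """Smallest k with max_buckets(c, k) >= query_size.

    Assumption (not stated verbatim in the abstract): the bound applies to
    any query containing at most max_buckets(c, k) buckets.
    """
    if query_size < 0:
        raise ValueError("query_size must be non-negative")
    k = 0
    # The bound is strictly increasing in k for c >= 1, so this terminates.
    while max_buckets(c, k) < query_size:
        k += 1
    return k


if __name__ == "__main__":
    for c in (1, 2, 3):
        print(c, [max_buckets(c, k) for k in range(1, 5)])
```

With c = 1 (no replication) the bound reduces to k buckets, while with c = 2 it grows quadratically as k² + 2k, which illustrates why even a single extra copy changes the guarantee substantially.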