Efficient retrieval of replicated data

Authors:
Ali Şaman Tosun
Affiliations:
Department of Computer Science, University of Texas at San Antonio, San Antonio 78249
Venue:
Distributed and Parallel Databases
Year:
2006

Abstract

Declustering is a common technique for reducing query response times: data is distributed over multiple disks so that query retrieval can be parallelized. Most research on declustering targets spatial range queries and investigates schemes with low additive error. Recently, declustering using replication has been proposed to reduce the additive overhead. Replication significantly reduces the retrieval cost of arbitrary queries. In this paper, we propose a disk allocation and retrieval mechanism for arbitrary queries based on design theory. Using the proposed c-copy replicated declustering scheme, $$(c-1)k^{2}+ck$$ buckets can be retrieved using at most k disk accesses. The retrieval algorithm is very efficient and is asymptotically optimal, with $$\Theta(|Q|)$$ complexity for a query Q. In addition to the deterministic worst-case bound and efficient retrieval, the proposed algorithm handles nonuniform data and high dimensions, supports incremental declustering, and has good fault-tolerance properties. Experimental results show the feasibility of the algorithm.
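
To make the stated bound concrete, the following minimal Python sketch evaluates $$(c-1)k^{2}+ck$$ and inverts it to find the smallest k that covers a query of a given size. It illustrates only the arithmetic of the bound, not the paper's allocation or retrieval algorithm. The function names `guaranteed_buckets` and `worst_case_accesses` are hypothetical, and the sketch assumes the bound applies for every positive c and non-negative k, ignoring any conditions the paper may place on the number of disks.

```python
def guaranteed_buckets(c: int, k: int) -> int:
    """Evaluate the abstract's bound (c-1)*k^2 + c*k.

    c is the number of copies of each bucket and k is the number of
    disk accesses. Assumption: the bound holds for all c >= 1, k >= 0.
    """
    if c < 1 or k < 0:
        raise ValueError("need c >= 1 and k >= 0")
    return (c - 1) * k * k + c * k


def worst_case_accesses(c: int, query_size: int) -> int:
    """Smallest k whose bound covers a query of query_size buckets.

    Hypothetical helper: it inverts the bound by linear search, which
    is adequate for illustration because the bound grows with k.
    """
    if query_size < 0:
        raise ValueError("query_size must be non-negative")
    k = 0
    while guaranteed_buckets(c, k) < query_size:
        k += 1
    return k


if __name__ == "__main__":
    # Tabulate the bound for a few copy counts and access counts.
    for c in (1, 2, 3):
        row = [guaranteed_buckets(c, k) for k in range(1, 5)]
        print(f"c={c}: buckets for k=1..4 -> {row}")
```

With a single copy (c = 1) the expression reduces to k, while for c ≥ 2 the quadratic term dominates, so under this reading the number of buckets covered by k accesses grows quadratically in k.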