The problem of disk declustering is to distribute data among multiple disks so that query response times are reduced through parallel I/O. A strictly optimal declustering technique is one that achieves optimal parallel I/O for all possible queries. In this paper, we focus on techniques that are optimized for spatial range queries. Current declustering techniques, which store a single copy of the data, have been shown to be suboptimal for range queries. The lower bound on extra disk accesses has been proved to be Ω(log N) for N disks, even in the restricted case of an N-by-N grid, and current approaches aim to achieve this bound. Replication is a well-known and effective solution to several problems in databases, especially availability and load balancing. In this paper, we explore the idea of replication in the context of declustering and propose a framework in which strictly optimal parallel I/O is achievable using a small amount of replication. We provide some theoretical foundations for replicated declustering, e.g., a bound on the number of copies required for strict optimality on any number of disks, and propose a class of replicated declustering schemes, periodic allocations, which are shown to be strictly optimal. The results for optimal disk allocation are extended to larger numbers of disks by increasing replication. Our techniques and results are valid for arbitrary a-by-b grids, and any declustering scheme can be further improved using our replication framework. Using the framework, we perform experiments to identify a strictly optimal disk access schedule for any given range query. In addition to the theoretical bounds, we compare the proposed replication-based scheme to existing techniques by performing experiments on real datasets.
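To make the notion of strict optimality concrete, the following minimal Python sketch measures the extra disk accesses of a single-copy allocation over every range query on an N-by-N grid. It is an illustration only: the linear form `(a*i + b*j) mod N` is an assumed stand-in for a periodic allocation (the abstract does not give the definition), replication is not modeled, and all function and parameter names are hypothetical.

```python
from collections import Counter
from math import ceil


def periodic_disk(i, j, n_disks, a=1, b=1):
    """Disk assigned to grid bucket (i, j).

    Assumption: a linear allocation (a*i + b*j) mod n_disks, used here
    only as an illustrative stand-in for a periodic allocation.
    """
    return (a * i + b * j) % n_disks


def response_time(i0, j0, rows, cols, n_disks, a=1, b=1):
    """Parallel I/O cost of a range query: the largest number of
    buckets that any single disk must read."""
    load = Counter(
        periodic_disk(i, j, n_disks, a, b)
        for i in range(i0, i0 + rows)
        for j in range(j0, j0 + cols)
    )
    return max(load.values())


def worst_additive_error(n_disks, a=1, b=1):
    """Largest gap between actual and optimal response time over all
    range queries on an n_disks-by-n_disks grid (0 means strictly
    optimal on that grid)."""
    worst = 0
    for rows in range(1, n_disks + 1):
        for cols in range(1, n_disks + 1):
            # Optimal cost: buckets spread perfectly evenly over the disks.
            optimal = ceil(rows * cols / n_disks)
            for i0 in range(n_disks - rows + 1):
                for j0 in range(n_disks - cols + 1):
                    actual = response_time(i0, j0, rows, cols, n_disks, a, b)
                    worst = max(worst, actual - optimal)
    return worst
```

Under these assumptions, `worst_additive_error(5, a=1, b=2)` evaluates to 0, so that allocation is strictly optimal on the 5-by-5 grid, whereas with `a=1, b=1` a 2-by-2 query already places two buckets on the same disk. The exhaustive loop is meant for small N only, since the number of range queries grows as N to the fourth power.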