Replicated declustering of spatial data

Authors:
Hakan Ferhatosmanoǧlu;Ali Şaman Tosun;Aravind Ramachandran
Affiliations:
Ohio State University, Columbus, OH;University of Texas, San Antonio, TX;Ohio State University, Columbus, OH
Venue:
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2004

Abstract

The problem of disk declustering is to distribute data among multiple disks so that parallel I/O reduces query response times. A declustering technique is strictly optimal if it achieves optimal parallel I/O for all possible queries. In this paper, we focus on techniques optimized for spatial range queries. Current declustering techniques, which store a single copy of the data, have been shown to be suboptimal for range queries: the lower bound on extra disk accesses has been proved to be Ω(log N) for N disks even in the restricted case of an N-by-N grid, and all current approaches try to achieve this bound. Replication is a well-known and effective solution to several problems in databases, especially availability and load balancing. We explore replication in the context of declustering and propose a framework in which strictly optimal parallel I/O is achievable with a small amount of replication. We provide theoretical foundations for replicated declustering, e.g., a bound on the number of copies needed for strict optimality on any number of disks, and propose a class of replicated declustering schemes, periodic allocations, which are shown to be strictly optimal. The results for optimal disk allocation are extended to larger numbers of disks by increasing replication. Our techniques and results are valid for arbitrary a-by-b grids, and any declustering scheme can be further improved using our replication framework. Using the framework, we perform experiments to identify a strictly optimal disk access schedule for any given range query. In addition to deriving the theoretical bounds, we compare the proposed replication-based scheme to existing techniques through experiments on real datasets.
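
To make the notions of response time and strict optimality concrete, the following is a minimal Python sketch and not the algorithm from the paper. It assumes a periodic allocation of the form (a*i + b*j) mod N for grid bucket (i, j), and it finds a retrieval schedule for replicated data with a standard augmenting-path bipartite matching. The function names and the coefficient pairs (1, 1) and (1, 2) are hypothetical choices made for illustration only.

```python
from itertools import product
from math import ceil


def disk(i, j, n_disks, a, b):
    """Assumed periodic allocation: bucket (i, j) -> (a*i + b*j) mod N."""
    return (a * i + b * j) % n_disks


def range_query(r0, c0, rows, cols):
    """List the grid buckets covered by a rows-by-cols range query."""
    return list(product(range(r0, r0 + rows), range(c0, c0 + cols)))


def single_copy_cost(buckets, n_disks, a, b):
    """Parallel accesses with one copy: the load of the busiest disk."""
    load = [0] * n_disks
    for i, j in buckets:
        load[disk(i, j, n_disks, a, b)] += 1
    return max(load)


def feasible(copies, n_disks, rounds):
    """Check whether each bucket can be read from one of its replica disks
    with no disk serving more than `rounds` buckets (capacitated matching)."""
    assigned = {d: [] for d in range(n_disks)}  # disk -> bucket indices

    def try_assign(k, seen):
        for d in copies[k]:
            if d in seen:
                continue
            seen.add(d)
            if len(assigned[d]) < rounds:
                assigned[d].append(k)
                return True
            # Disk is full: try to move one of its buckets to another replica.
            for idx, other in enumerate(assigned[d]):
                if try_assign(other, seen):
                    assigned[d][idx] = k
                    return True
        return False

    return all(try_assign(k, set()) for k in range(len(copies)))


def replicated_cost(buckets, n_disks, schemes):
    """Fewest parallel accesses when each bucket has one copy per scheme."""
    copies = [{disk(i, j, n_disks, a, b) for a, b in schemes}
              for i, j in buckets]
    rounds = ceil(len(buckets) / n_disks)  # lower bound for any allocation
    while not feasible(copies, n_disks, rounds):
        rounds += 1
    return rounds
```

For example, with 4 disks and the 2-by-2 query `range_query(0, 0, 2, 2)`, `single_copy_cost(q, 4, 1, 1)` is 2 because buckets (0, 1) and (1, 0) share a disk, while `replicated_cost(q, 4, [(1, 1), (1, 2)])` is 1, which matches the lower bound ceil(4/4). An allocation is strictly optimal in the abstract's sense when this lower bound is met for every range query, not just one.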