Cartesian product files have recently been shown to exhibit attractive properties for partial match queries. This paper considers the file allocation problem for Cartesian product files, which can be stated as follows: Given a k-attribute Cartesian product file and an m-disk system, allocate buckets among the m disks in such a way that, for all possible partial match queries, the concurrency of disk accesses is maximized. The Disk Modulo (DM) allocation method is described first, and it is shown to be strict optimal under many conditions commonly occurring in practice, including all possible partial match queries when the number of disks is 2 or 3. It is also shown that the DM allocation method, although it performs well, is not strict optimal for all possible partial match queries when the number of disks is greater than 3. The General Disk Modulo (GDM) allocation method is then described, and a sufficient but not necessary condition is derived for strict optimality of the GDM method for all partial match queries and any number of disks. Simulation studies comparing the DM and random allocation methods in terms of the average number of disk accesses, in response to various classes of partial match queries, show the former to be significantly more effective even when the number of disks is greater than 3, that is, even in cases where the DM method is not strict optimal. The results that have been derived formally and shown by simulation can be used for more effective design of optimal file systems for partial match queries. When considering multiple-disk systems with independent access paths, it is important to ensure that similar records are clustered into the same or similar buckets, while similar buckets should be dispersed uniformly among the disks.
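The abstract names the DM and GDM methods without giving their formulas. The following Python sketch illustrates the claims under stated assumptions: that DM places bucket (i1, ..., ik) on disk (i1 + ... + ik) mod m, that GDM uses a weighted sum (a1·i1 + ... + ak·ik) mod m, and that an allocation is strict optimal when no partial match query touching q buckets places more than ceil(q/m) of them on a single disk. All function names and the query encoding (`None` for an unspecified attribute) are hypothetical and chosen only for this illustration.

```python
from itertools import product
from math import ceil


def dm_disk(bucket, m):
    """Assumed DM rule: disk number is the sum of the bucket coordinates mod m."""
    return sum(bucket) % m


def gdm(coeffs):
    """Assumed GDM rule: weighted coordinate sum mod m, one coefficient per attribute."""
    def alloc(bucket, m):
        return sum(a * i for a, i in zip(coeffs, bucket)) % m
    return alloc


def query_buckets(sizes, query):
    """Buckets matched by a partial match query (None marks an unspecified attribute)."""
    ranges = [range(n) if q is None else (q,) for n, q in zip(sizes, query)]
    return list(product(*ranges))


def response_time(sizes, query, m, alloc=dm_disk):
    """Largest number of matched buckets that land on any single disk."""
    loads = [0] * m
    for bucket in query_buckets(sizes, query):
        loads[alloc(bucket, m)] += 1
    return max(loads)


def is_strict_optimal(sizes, m, alloc=dm_disk):
    """Check every partial match query against the ceil(q/m) bound."""
    for query in product(*[[None, *range(n)] for n in sizes]):
        q = len(query_buckets(sizes, query))
        if response_time(sizes, query, m, alloc) > ceil(q / m):
            return False
    return True


if __name__ == "__main__":
    print(is_strict_optimal((4, 4), 3))               # DM, 3 disks
    print(is_strict_optimal((2, 2), 4))               # DM, 4 disks
    print(is_strict_optimal((2, 2), 4, gdm((1, 2))))  # GDM, 4 disks
```

For a 2 × 2 file on 4 disks, the assumed DM rule sends buckets (0, 1) and (1, 0) to the same disk, so the fully unspecified query needs two accesses on that disk where one would suffice. With the coefficients (1, 2), the GDM rule places all four buckets on distinct disks. This small case is consistent with the abstract's statement that DM is not strict optimal for every query once the number of disks exceeds 3.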