Parallel free-text search on the connection machine system
Communications of the ACM - Special issue on parallelism
SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
A case for redundant arrays of inexpensive disks (RAID)
SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
The Grid File: An Adaptable, Symmetric Multikey File Structure
ACM Transactions on Database Systems (TODS)
Disk allocation for Cartesian product files on multiple-disk systems
ACM Transactions on Database Systems (TODS)
Optimal partial-match retrieval when fields are independently specified
ACM Transactions on Database Systems (TODS)
Parallel searching for binary Cartesian product files
CSC '85 Proceedings of the 1985 ACM thirteenth annual conference on Computer Science
Attribute based file organization in a paged memory environment
Communications of the ACM
GAMMA - A High Performance Dataflow Database Machine
VLDB '86 Proceedings of the 12th International Conference on Very Large Data Bases
Semantic complexity of classes of relational queries and query independent data partitioning
PODS '91 Proceedings of the tenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
SIGMOD '91 Proceedings of the 1991 ACM SIGMOD international conference on Management of data
Optimal disk allocation for partial match queries
ACM Transactions on Database Systems (TODS)
Efficient disk allocation for fast similarity searching
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
On the optimality of disk allocation for Cartesian product files (extended abstract)
PODS '90 Proceedings of the ninth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
CMD: A Multidimensional Declustering Method for Parallel Data Systems
VLDB '92 Proceedings of the 18th International Conference on Very Large Data Bases
Hamming Filters: A Dynamic Signature File Organization for Parallel Stores
VLDB '93 Proceedings of the 19th International Conference on Very Large Data Bases
Optimal Parallel I/O for Range Queries through Replication
DEXA '02 Proceedings of the 13th International Conference on Database and Expert Systems Applications
Disk Allocation for Fast Range and Nearest-Neighbor Queries
Distributed and Parallel Databases
Replicated declustering for arbitrary queries
Proceedings of the 2004 ACM symposium on Applied computing
Replicated declustering of spatial data
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Efficient retrieval of replicated data
Distributed and Parallel Databases
Efficient parallel processing of range queries through replicated declustering
Distributed and Parallel Databases
Information Sciences: an International Journal
Proceedings of the 2007 ACM symposium on Applied computing
Divide-and-conquer scheme for strictly optimal retrieval of range queries
ACM Transactions on Storage (TOS)
Threshold based declustering in high dimensions
DEXA'05 Proceedings of the 16th international conference on Database and Expert Systems Applications
The problem examined is how to distribute a binary Cartesian product file over multiple disks so as to maximize parallelism for partial match queries. Cartesian product files arise from some secondary key access methods, such as multiattribute hashing [10] and the grid file [6]. In the binary case, the problem reduces to partitioning the 2^n binary strings of n bits into m groups of dissimilar strings. The main idea proposed in this paper is to group the strings so that each group forms an Error Correcting Code (ECC). This construction guarantees that the strings of a given group have large Hamming distances, i.e., they differ in many bit positions. Intuitively, this should result in good declustering. We briefly mention previous heuristics for declustering, describe exactly how to build a declustering scheme using an ECC, and prove a theorem that gives a necessary condition for our method to be optimal. Analytical results show that our method is superior to older heuristics and that it is very close to the theoretical (non-tight) bound.
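As a rough illustration of the grouping idea, the sketch below assigns each n-bit bucket address to a disk by computing its syndrome under the parity-check matrix of a linear code. This is an assumption made for the example, not necessarily the construction used in the paper: the abstract does not say how the groups are built. The matrix `H` (the [7,4] Hamming code), and the helper names `disk_of`, `decluster`, `min_distance`, and `max_load` are all hypothetical. With this choice, n = 7 and m = 8, and every group is a coset of the code, so any two strings on the same disk differ in at least 3 bit positions.

```python
from itertools import product

# Assumed for illustration: parity-check matrix of the [7,4] Hamming code.
# Its 7 columns are the 7 distinct nonzero 3-bit vectors.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def disk_of(bits, H=H):
    """Map a bit sequence to a disk number: its syndrome H*x (mod 2)."""
    syndrome = 0
    for row in H:
        parity = sum(h & b for h, b in zip(row, bits)) % 2
        syndrome = (syndrome << 1) | parity
    return syndrome

def decluster(n, H=H):
    """Partition all 2^n binary strings into groups, one group per disk."""
    groups = {}
    for bits in product((0, 1), repeat=n):
        groups.setdefault(disk_of(bits, H), []).append(bits)
    return groups

def min_distance(group):
    """Smallest Hamming distance between any two strings in a group."""
    return min(
        sum(x != y for x, y in zip(a, b))
        for i, a in enumerate(group)
        for b in group[i + 1:]
    )

def max_load(pattern, H=H):
    """Largest number of qualifying buckets on one disk for a partial match
    query, written as a string over '0', '1' and '*' (unspecified bit)."""
    free = [i for i, c in enumerate(pattern) if c == "*"]
    load = {}
    for fill in product((0, 1), repeat=len(free)):
        bits = [0 if c == "*" else int(c) for c in pattern]
        for i, v in zip(free, fill):
            bits[i] = v
        d = disk_of(bits, H)
        load[d] = load.get(d, 0) + 1
    return max(load.values())

groups = decluster(7)
```

Under these assumptions, a query with two unspecified bits has four qualifying buckets that differ pairwise in at most two positions, so they must fall on four different disks and `max_load` returns 1 for such a pattern. Queries with more unspecified bits are not guaranteed to be spread perfectly by this particular code.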