Increasingly large data sets are being stored in networked architectures, yet many of the available data structures are not easily amenable to parallel realizations. Hashing schemes show promise in this respect because the underlying data structure can be decomposed and spread among a set of cooperating nodes with minimal communication and maintenance requirements. In all cases, storage utilization and load balancing must be addressed. There are two basic approaches to the problem. One is to address it as part of the design of the data structure used to store and retrieve the data. The other is to leave the data structure intact and address the problem separately. The method that we present here falls into the latter category and is applicable whenever a hash table is the preferred data structure. Every hash table comes with a hashing function that partitions a possibly unbounded set of data items into a finite set of groups by assigning each data item to exactly one group. In general, the hashing function cannot guarantee that the groups will have the same average cardinality for all possible data item distributions. In this paper, we propose a two-stage methodology that uses knowledge of the hashing function to reorganize the group assignments so that the resulting groups have similar expected cardinalities. The method is generally applicable and independent of the hashing function used. We demonstrate the power of the methodology on both synthetic and real-world databases. The resulting quasi-uniform storage occupancy and the associated load-balancing gains are significant.
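The abstract does not spell out the two stages, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that expected group cardinalities can be estimated by hashing a representative sample, and it swaps in a standard greedy longest-processing-time heuristic to reassign groups to nodes. The names `estimate_bucket_loads`, `assign_buckets_to_nodes`, and `node_loads` are hypothetical.

```python
import heapq
from collections import Counter


def estimate_bucket_loads(sample, hash_fn, num_buckets):
    """Stage 1 (assumed): estimate each group's cardinality from a sample."""
    counts = Counter(hash_fn(item) % num_buckets for item in sample)
    return [counts.get(bucket, 0) for bucket in range(num_buckets)]


def assign_buckets_to_nodes(bucket_loads, num_nodes):
    """Stage 2 (assumed): greedily place the heaviest remaining group on the
    currently lightest node. This is a stand-in heuristic, not the paper's."""
    heap = [(0, node) for node in range(num_nodes)]  # (current load, node id)
    heapq.heapify(heap)
    assignment = [None] * len(bucket_loads)
    heaviest_first = sorted(range(len(bucket_loads)),
                            key=lambda bucket: -bucket_loads[bucket])
    for bucket in heaviest_first:
        load, node = heapq.heappop(heap)
        assignment[bucket] = node
        heapq.heappush(heap, (load + bucket_loads[bucket], node))
    return assignment


def node_loads(bucket_loads, assignment, num_nodes):
    """Total estimated load per node under a given group-to-node assignment."""
    totals = [0] * num_nodes
    for bucket, node in enumerate(assignment):
        totals[node] += bucket_loads[bucket]
    return totals


if __name__ == "__main__":
    skewed_loads = [10, 1, 1, 1, 9, 1, 1, 8]  # made-up, non-uniform group sizes
    mapping = assign_buckets_to_nodes(skewed_loads, num_nodes=2)
    print(node_loads(skewed_loads, mapping, num_nodes=2))
```

The hashing function itself is left untouched; only the mapping from groups to nodes changes, which mirrors the abstract's point that the data structure stays intact while the balancing problem is handled separately.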