We propose a new similarity-based technique for declustering data. The proposed method can adapt to available information about query distributions, data distributions, data sizes, and partition-size constraints. The method is based on max-cut partitioning of a similarity graph defined over the given set of data, under constraints on the partition sizes. It maximizes the chances that a pair of data items that are accessed together by queries are allocated to distinct disks. We show that the proposed method achieves optimal speed-up for a query set whenever any other declustering method can achieve that optimal speed-up. Experiments in parallelizing grid files show that the proposed method outperforms mapping-function-based methods for interesting query distributions as well as for non-uniform data distributions.
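To make the idea of size-constrained max-cut partitioning concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes that the similarity between two data items is the number of queries that access both, and it uses a simple greedy heuristic: each item goes to the disk (with free capacity) where it adds the least within-disk similarity, which is equivalent to adding the most cut weight. The function names `build_similarity`, `greedy_decluster`, and `cut_weight` are hypothetical, chosen only for this illustration.

```python
from itertools import combinations


def build_similarity(queries):
    """Assumed similarity: weight of (a, b) = number of queries touching both."""
    sim = {}
    for q in queries:
        for a, b in combinations(sorted(set(q)), 2):
            sim[(a, b)] = sim.get((a, b), 0) + 1
    return sim


def greedy_decluster(items, sim, num_disks, capacity):
    """Greedy heuristic for max-cut partitioning with a per-disk size limit."""
    if num_disks * capacity < len(items):
        raise ValueError("not enough total capacity for all items")

    def w(a, b):
        # Edges are stored with the smaller item first.
        return sim.get((a, b) if a <= b else (b, a), 0)

    disks = [[] for _ in range(num_disks)]
    # Place the most strongly connected items first.
    order = sorted(items, key=lambda x: -sum(w(x, y) for y in items if y != x))
    for x in order:
        open_disks = [d for d in range(num_disks) if len(disks[d]) < capacity]
        # Least similarity to items already on the disk; ties go to the emptier disk.
        best = min(
            open_disks,
            key=lambda d: (sum(w(x, y) for y in disks[d]), len(disks[d])),
        )
        disks[best].append(x)
    return disks


def cut_weight(disks, sim):
    """Total similarity between items placed on different disks."""
    disk_of = {x: d for d, members in enumerate(disks) for x in members}
    return sum(wt for (a, b), wt in sim.items() if disk_of[a] != disk_of[b])
```

For example, with four blocks `[0, 1, 2, 3]`, the queries `[[0, 1], [2, 3], [0, 1], [0, 2]]`, two disks, and a capacity of two blocks per disk, this sketch places blocks 0 and 3 on one disk and blocks 1 and 2 on the other, so every pair of blocks that some query accesses together ends up on distinct disks. A greedy pass like this carries no optimality guarantee in general; it only illustrates the objective the abstract describes.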