Iterative-improvement-based declustering heuristics for multi-disk databases

Authors:
Mehmet Koyutürk;Cevdet Aykanat
Affiliations:
Department of Computer Sciences, Purdue University, West Lafayette, IN;Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
Venue:
Information Systems
Year:
2005

Abstract

Data declustering is an important issue for reducing query response times in multi-disk database systems. In this paper, we propose a declustering method that utilizes the available information on query distribution, data distribution, data-item sizes, and disk capacity constraints. The proposed method exploits the natural correspondence between a data set with a given query distribution and a hypergraph. We define an objective function that exactly represents the aggregate parallel query-response time for the declustering problem and adapt the iterative-improvement-based heuristics successfully used in hypergraph partitioning to this objective function. We propose a two-phase algorithm that first obtains an initial K-way declustering by recursively bipartitioning the data set, then applies multiway refinement on this declustering. We provide effective gain models and efficient implementation schemes for both phases. The experimental results on a wide range of realistic data sets show that the proposed method provides a significant performance improvement compared with the state-of-the-art declustering strategy based on similarity-graph partitioning.
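
The objective and the refinement idea described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm. It assumes unit-size data items, a per-disk capacity counted in items, and a query response time equal to the number of requested items on the busiest disk. The names `query_cost`, `aggregate_cost`, and `refine` are hypothetical. Where the abstract mentions efficient gain models, this sketch simply recomputes the full cost for every trial move, which keeps it short but slow.

```python
from collections import Counter


def query_cost(items, assignment):
    """Response time of one query: the load of its busiest disk.

    Assumption: every item has unit size, so load is an item count.
    """
    load = Counter(assignment[item] for item in items)
    return max(load.values(), default=0)


def aggregate_cost(queries, assignment):
    """Frequency-weighted sum of per-query response times."""
    return sum(freq * query_cost(items, assignment) for items, freq in queries)


def refine(queries, assignment, num_disks, capacity):
    """Greedy single-item moves until no move lowers the aggregate cost.

    A simplified stand-in for multiway iterative improvement: each trial
    move is scored by recomputing the whole objective.
    """
    assignment = dict(assignment)
    improved = True
    while improved:
        improved = False
        for item in sorted(assignment):
            sizes = Counter(assignment.values())
            best_disk = assignment[item]
            best_cost = aggregate_cost(queries, assignment)
            for disk in range(num_disks):
                # Skip the current disk and any disk that is already full.
                if disk == assignment[item] or sizes[disk] >= capacity:
                    continue
                trial = {**assignment, item: disk}
                cost = aggregate_cost(queries, trial)
                if cost < best_cost:
                    best_disk, best_cost = disk, cost
            if best_disk != assignment[item]:
                assignment[item] = best_disk
                improved = True
    return assignment


# Each query is (set of requested items, relative frequency).
queries = [({"a", "b"}, 3), ({"c", "d"}, 1), ({"a", "c"}, 2)]
initial = {"a": 0, "b": 0, "c": 1, "d": 1}
refined = refine(queries, initial, num_disks=2, capacity=3)
print(aggregate_cost(queries, initial), aggregate_cost(queries, refined))
```

In this toy instance each query plays the role of a hyperedge over the items it requests, and the refinement spreads the items of frequent queries across disks while respecting the capacity limit.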