Hash-based labeling techniques for storage scaling

Authors:
D. Yao; Cyrus Shahabi; Per-Åke Larson
Affiliations:
Computer Science Department, University of Southern California, USA; Computer Science Department, University of Southern California, USA; Microsoft Corporation, USA
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2005

Abstract

Scalable storage architectures allow storage devices to be added or removed, either to increase storage capacity and bandwidth or to retire older devices. Assuming that data objects are placed randomly across the storage devices of a storage pool, our optimization objective is to redistribute the minimum number of objects after the pool is scaled. In addition, the distribution should remain uniform after redistribution so that the load stays balanced. Moreover, the redistributed objects should be retrievable efficiently during normal operation: in one I/O access and with low-complexity computation. To achieve this, we propose an algorithm called random disk labeling (RDL), based on double hashing, in which storage can be added or removed without any increase in complexity. We compare RDL with other proposed techniques and demonstrate its effectiveness through experimentation.
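
The abstract names double hashing as the basis of RDL but does not give the algorithm, so the following is only a minimal sketch of the general idea of locating an object by double-hashing probes over a table of labeled slots. It is not the authors' implementation. The table size `P`, the helper `_h`, the function `locate`, and the slot labels in the usage lines are all hypothetical names and values chosen for illustration; the sketch assumes `P` is prime so that every probe sequence visits each slot exactly once.

```python
import hashlib

# Assumed prime table size; it must be at least the largest number of
# disks the pool will ever hold. This value is illustrative only.
P = 101


def _h(key: str, salt: str) -> int:
    """Hypothetical helper: a deterministic 64-bit hash of key under a salt."""
    digest = hashlib.sha256((salt + ":" + key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def locate(key: str, slots: dict) -> str:
    """Return the disk for `key`.

    `slots` maps a slot label in range(P) to a disk name. Each disk is
    assumed to have been given a distinct, randomly chosen slot label.
    """
    if not slots:
        raise ValueError("the pool has no disks")
    start = _h(key, "start") % P
    # Step lies in 1..P-1; because P is prime, the probe sequence is a
    # permutation of all P slots and must reach an occupied one.
    step = 1 + _h(key, "step") % (P - 1)
    for i in range(P):
        slot = (start + i * step) % P
        if slot in slots:
            return slots[slot]
    raise AssertionError("unreachable when slots is non-empty")


# Usage: three disks at assumed slot labels, then a fourth is added.
pool = {7: "disk0", 42: "disk1", 88: "disk2"}
keys = [f"obj{i}" for i in range(1000)]
before = {k: locate(k, pool) for k in keys}
pool[19] = "disk3"
moved = [k for k in keys if locate(k, pool) != before[k]]
print(f"{len(moved)} of {len(keys)} objects moved")
```

In this sketch, adding a disk changes the location only of objects whose probe sequence reaches the new slot before the slot they previously stopped at, so every moved object lands on the new disk. Removing a disk relocates only the objects that were stored on it. Lookup needs no per-object directory, only the hash computation and the small slot table.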