Slead: low-memory, steady distributed systems slicing

Authors:
Francisco Maia;Miguel Matos;Etienne Rivière;Rui Oliveira
Affiliations:
High-Assurance Software Laboratory, INESC TEC & University of Minho, Portugal;High-Assurance Software Laboratory, INESC TEC & University of Minho, Portugal;Université de Neuchâtel, Switzerland;High-Assurance Software Laboratory, INESC TEC & University of Minho, Portugal
Venue:
DAIS'12 Proceedings of the 12th IFIP WG 6.1 international conference on Distributed Applications and Interoperable Systems
Year:
2012

Abstract

Slicing a large-scale distributed system is the process of autonomously partitioning its nodes into k groups, named slices. Slicing is associated with an order on a node-specific criterion, such as available storage, uptime, or bandwidth. Each slice corresponds to the nodes between two quantiles in a virtual ranking according to that criterion. For instance, a system can be split into three groups: one with the nodes with the lowest uptimes, one with the nodes with the highest uptimes, and one in the middle. Such a partitioning can be used by applications to assign different tasks to different groups of nodes, e.g., assigning critical tasks to the more powerful or stable nodes and less critical tasks to other slices. Assigning a slice to each node in a large-scale distributed system, where no global knowledge of nodes' criteria exists, is not trivial. Recently, much research effort has been dedicated to guaranteeing fast and correct convergence with respect to a global sort of the nodes. Unfortunately, state-of-the-art slicing protocols exhibit flaws that preclude their application in real scenarios, in particular with respect to cost and stability. In this paper, we identify two such issues: a lack of steadiness, where nodes at a slice border constantly switch slices, and the large memory requirements needed for adequate convergence. We provide practical solutions for both. Our solutions are generic and can be applied to two different state-of-the-art slicing protocols with little effort while preserving the desirable properties of each. The effectiveness of the proposed solutions is extensively studied in several simulated experiments.
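
To make the slice definition concrete, the following is a minimal, centralized sketch of the global sort that the abstract uses as the reference for convergence. It is not the protocol proposed in the paper: a real slicing protocol has no global view and must estimate each node's rank by gossip. The function names `global_slices` and `slice_from_position`, and the sample uptimes, are assumptions made only for illustration.

```python
def global_slices(values, k):
    """Assign each node a slice in 0..k-1 from a global sort on its attribute.

    Centralized reference only (assumed for illustration): distributed
    slicing protocols approximate this result without global knowledge.
    """
    n = len(values)
    # Node indices ordered by attribute value; ties are broken by index.
    order = sorted(range(n), key=lambda i: values[i])
    slices = [0] * n
    for rank, node in enumerate(order):
        # Rank r out of n falls into quantile bucket floor(r * k / n).
        slices[node] = rank * k // n
    return slices


def slice_from_position(position, k):
    """Map an estimated normalized rank in [0, 1] to a slice index."""
    return min(int(position * k), k - 1)


uptimes = [5, 42, 17, 8, 99, 23]      # hypothetical per-node uptimes
assignment = global_slices(uptimes, 3)
print(assignment)                      # [0, 2, 1, 0, 2, 1]

# A node whose rank estimate hovers around a slice border (1/3 for k = 3)
# changes slice with every small fluctuation of the estimate.
print(slice_from_position(0.332, 3), slice_from_position(0.334, 3))  # 0 1
```

The last line hints at the steadiness issue mentioned in the abstract: when a node's estimated position sits near a quantile boundary, tiny changes in the estimate are enough to move it from one slice to the next.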