Jgroup-ARM: a distributed object group platform with autonomous replication management

Authors:
Hein Meling; Alberto Montresor; Bjarne E. Helvik; Ozalp Babaoglu
Affiliations:
Department of Electrical Engineering and Computer Science, University of Stavanger, 4036 Stavanger, Norway; Department of Information and Communication Technology, University of Trento, via Sommarive 14, 38050 Povo, Italy; Centre for Quantifiable Quality of Service in Communication Systems (Q2S), Norwegian University of Science and Technology, O.S. Bragstads plass 2E, 7491 Trondheim, Norway; Department of Computer Science, University of Bologna, Mura Anteo Zamboni 7, 40127 Bologna, Italy
Venue:
Software—Practice & Experience
Year:
2008

Abstract

This paper presents the design and implementation of Jgroup-ARM, a distributed object group platform with autonomous replication management, along with a novel measurement-based assessment technique used to validate its fault-handling capability. Jgroup extends Java RMI through the group communication paradigm and has been designed specifically for application support in partitionable systems. ARM aims to improve the dependability characteristics of systems through a fault-treatment mechanism. Hence, ARM focuses on deployment and operational aspects, where the gain in terms of improved dependability is likely to be the greatest. The main objective of ARM is to localize failures and to reconfigure the system according to application-specific dependability requirements. Combining Jgroup and ARM can significantly reduce the effort necessary for developing, deploying and managing dependable, partition-aware applications. Jgroup-ARM is evaluated experimentally to validate its fault-handling capability: the recovery performance of a system deployed in a wide area network is measured. In this experiment, multiple nearly coincident reachability changes are injected to emulate network partitions separating the service replicas. The results show that Jgroup-ARM is able to recover applications to their initial state in several realistic failure scenarios, including multiple, concurrent network partitions. Copyright © 2007 John Wiley & Sons, Ltd.