Determining Redundancy Levels for Fault Tolerant Real-Time Systems

Authors:
Fuxing Wang;Krithi Ramamritham;John A. Stankovic
Venue:
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Year:
1995

Citing 5
Cited 20

A fault-tolerant scheduling problem

IEEE Transactions on Software Engineering
On Scheduling Tasks with a Quick Recovery from Failure

IEEE Transactions on Computers
Optimal reconfiguration strategy for a degradable multimodule computing system

Journal of the ACM (JACM)
Simple and integrated heuristic algorithms for scheduling tasks with time and resource constraints

Journal of Systems and Software
Real-time Systems Performance in the Presence of Failures

Computer - Special issue on real-time systems

A Fault-Tolerant Dynamic Scheduling Algorithm for Multiprocessor Real-Time Systems and Its Analysis

IEEE Transactions on Parallel and Distributed Systems
COFTA: Hardware-Software Co-Synthesis of Heterogeneous Distributed Embedded Systems for Low Overhead Fault Tolerance

IEEE Transactions on Computers
An Optimal Value-Based Admission Policy and its Reflective Use in Real-Time Systems

Real-Time Systems
An Adaptive Scheme for Fault-Tolerant Scheduling of Soft Real-Time Tasks in Multiprocessor Systems

HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
A Generalized Analytic Performance Model of Distributed Systems that Perform N Tasks Using P Fault-Prone Processors

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
A Reliability-Aware Value-Based Scheduler for Dynamic Multiprocessor Real-Time Systems

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
FARM: A Feedback-Based Adaptive Resource Management for Autonomous Hot-Spot Convergence System

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Value-Driven Resource Assignment in Object-Oriented Real-Time Dependable Systems

WORDS '97 Proceedings of the 3rd Workshop on Object-Oriented Real-Time Dependable Systems
Scheduling Algorithms Exploiting Spare Capacity and Tasks' Laxities for Fault Detection and Location in Real-time Multiprocessor Systems

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
An adaptive scheme for fault-tolerant scheduling of soft real-time tasks in multiprocessor systems

Journal of Parallel and Distributed Computing
Efficient task replication and management for adaptive fault tolerance in mobile Grid environments

Future Generation Computer Systems - Special section: Information engineering and enterprise architecture in distributed computing environments
A fault-tolerant approach to test control utilizing dual-redundant processors

Advances in Engineering Software
FLARe: a Fault-tolerant Lightweight Adaptive Real-time middleware for distributed real-time and embedded systems

Proceedings of the 4th on Middleware doctoral symposium
Scheduling of fault-tolerant embedded systems with soft and hard timing constraints

Proceedings of the conference on Design, automation and test in Europe
A highly available job execution service in computational service market

GRID '07 Proceedings of the 8th IEEE/ACM International Conference on Grid Computing
Towards middleware for fault-tolerance in distributed real-time and embedded systems

DAIS'08 Proceedings of the 8th IFIP WG 6.1 international conference on Distributed applications and interoperable systems
Scheduling for real-time mobile MapReduce systems

Proceedings of the 5th ACM international conference on Distributed event-based system
Design and analysis of a novel load-balancing model based on mobile agent

ICMLC'05 Proceedings of the 4th international conference on Advances in Machine Learning and Cybernetics
A task replication and fair resource management scheme for fault tolerant grids

EGC'05 Proceedings of the 2005 European conference on Advances in Grid Computing
Achieving high job execution reliability using underutilized resources in a computational economy

Future Generation Computer Systems

Abstract

Many real-time systems have both performance requirements and reliability requirements. Performance is usually measured in terms of the value in completing tasks on time. Reliability is evaluated by hardware and software failure models. In many situations, there are trade-offs between task performance and task reliability. Thus, a mathematical assessment of performance-reliability trade-offs is necessary to evaluate the performance of real-time fault-tolerance systems. Assuming that the reliability of task execution is achieved through task replication, we present an approach that mathematically determines the replication factor for tasks. Our approach is novel in that it is a task-schedule-based analysis rather than a state-based analysis as found in other models. Because we use a task-schedule-based analysis, we can provide a fast method to determine optimal redundancy levels, we are not limited to hardware reliability given by constant failure rate functions as in most other models, and we hypothesize that we can more naturally integrate with on-line real-time scheduling than when state-based techniques are used. In this work, the goal is to maximize the total performance index, which is a performance-related reliability measurement. We present a technique based on a continuous task model and show how it very closely approximates discrete models and tasks with varying characteristics.

Index Terms: Real-time systems, reliability, degradable systems, fault tolerance, functional variation, performability.
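
The trade-off the abstract describes, between replicating tasks for reliability and scheduling more tasks for value, can be illustrated with a toy calculation. The sketch below is not the paper's model: it assumes a fixed number of execution slots, identical tasks, independent copy failures, a reward for each task that completes, and a penalty for each task whose copies all fail. The names `expected_index` and `best_replication` and every parameter are hypothetical.

```python
def expected_index(k, capacity, value, penalty, p_fail):
    """Expected total index when every task runs as k copies.

    Assumptions (illustrative only, not taken from the paper):
    - capacity is the number of available execution slots,
    - each copy fails independently with probability p_fail,
    - a task earns `value` if at least one copy succeeds,
      and costs `penalty` if all k copies fail.
    """
    tasks = capacity // k          # more copies per task means fewer tasks fit
    all_fail = p_fail ** k         # probability that every copy of a task fails
    per_task = value * (1 - all_fail) - penalty * all_fail
    return tasks * per_task


def best_replication(capacity, value, penalty, p_fail, k_max):
    """Return the replication factor in 1..k_max with the largest expected index."""
    return max(
        range(1, k_max + 1),
        key=lambda k: expected_index(k, capacity, value, penalty, p_fail),
    )


if __name__ == "__main__":
    # With a large failure penalty, some redundancy pays off.
    print(best_replication(capacity=12, value=1.0, penalty=10.0, p_fail=0.1, k_max=4))
    # With no penalty, replication only reduces the number of tasks served.
    print(best_replication(capacity=12, value=1.0, penalty=0.0, p_fail=0.1, k_max=4))
```

In this toy setting the best replication factor depends on how heavily a failed task is penalized relative to the value of a completed one, which is the kind of performance-reliability balance the abstract refers to.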