A measurement-based performability model is described that is based on error and resource-usage data collected on a multiprocessor system. A method for identifying the model structure is introduced, and the resulting model is validated against real data. Model development, from the collection of raw data to the estimation of the expected reward, is described. Both normal behavior and error behavior of the system are characterized. The measured data show that the holding times in key operational and error states are not simply exponentially distributed and that a semi-Markov process is necessary to model the system behavior. A reward function, which is based on the service rate and the error rate in each state, is defined in order to estimate the performability of the system and to depict the cost of different types of errors.
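To make the last step concrete, here is a minimal sketch of how a steady-state expected reward could be computed for a semi-Markov model. It is an illustration under stated assumptions, not the paper's procedure. The three states, the transition probabilities, the mean holding times, and the reward form `s / (s + e)` are all hypothetical; the abstract only says that the reward depends on the service rate and the error rate in each state. The occupancy formula (embedded-chain stationary probability weighted by mean holding time) is the standard semi-Markov result, and it uses only the mean of each holding-time distribution, so the holding times need not be exponential.

```python
def stationary_distribution(P, iterations=10_000, tol=1e-12):
    """Stationary distribution of the embedded Markov chain with matrix P.

    Uses a "lazy" power iteration (pi <- 0.5*pi + 0.5*pi*P), which has the
    same fixed point as pi = pi*P but also converges for periodic chains.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        step = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        nxt = [0.5 * a + 0.5 * b for a, b in zip(pi, step)]
        converged = max(abs(a - b) for a, b in zip(nxt, pi)) < tol
        pi = nxt
        if converged:
            break
    total = sum(pi)
    return [p / total for p in pi]


def state_occupancy(pi, mean_holding_times):
    """Long-run fraction of time spent in each state of a semi-Markov process."""
    weighted = [p * h for p, h in zip(pi, mean_holding_times)]
    total = sum(weighted)
    return [w / total for w in weighted]


def reward_rate(service_rate, error_rate):
    """Hypothetical reward form: share of activity that is useful service."""
    total = service_rate + error_rate
    return service_rate / total if total > 0 else 0.0


def expected_reward(P, mean_holding_times, rewards):
    """Steady-state expected reward: sum over states of occupancy * reward."""
    pi = stationary_distribution(P)
    occupancy = state_occupancy(pi, mean_holding_times)
    return sum(o * r for o, r in zip(occupancy, rewards))


if __name__ == "__main__":
    # All numbers below are made up for illustration.
    states = ["normal", "error", "recovery"]
    P = [
        [0.0, 0.8, 0.2],  # normal   -> error / recovery
        [0.5, 0.0, 0.5],  # error    -> normal / recovery
        [1.0, 0.0, 0.0],  # recovery -> normal
    ]
    mean_holding_times = [100.0, 2.0, 5.0]  # seconds, hypothetical
    rewards = [
        reward_rate(9.9, 0.1),  # normal: mostly useful service
        reward_rate(5.0, 5.0),  # error: degraded service
        0.0,                    # recovery: no useful service
    ]
    print(f"expected reward: {expected_reward(P, mean_holding_times, rewards):.4f}")
```

Lowering the reward of a given error state, or lengthening its mean holding time, and recomputing the expected reward is one simple way to compare the cost of different error types within such a model.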