In this paper, we consider the problem of modeling machine availability in enterprise-area and wide-area distributed computing settings. Using availability data gathered from three different environments, we detail the suitability of four potential statistical distributions for each data set: exponential, Pareto, Weibull, and hyperexponential. In each case, we use software we have developed to determine the necessary parameters automatically from each data set. To gauge suitability, we present both graphical and statistical evaluations of the accuracy with which each distribution fits each data set. For all three data sets, we find that a hyperexponential model fits slightly more accurately than a Weibull, but that both are substantially better choices than either an exponential or a Pareto. These results indicate that either a hyperexponential or a Weibull model effectively represents machine availability in enterprise and Internet computing environments.
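To make the comparison concrete, the following is a minimal Python sketch of fitting the four candidate distributions to a sample of availability durations and ranking them by a goodness-of-fit statistic. It is not the authors' software. Everything in it is an assumption made for illustration: the synthetic Weibull-generated sample standing in for real availability traces, the choice of maximum likelihood for the exponential, Pareto, and Weibull fits, the two-phase EM fit for the hyperexponential, the Kolmogorov-Smirnov distance as the statistic, and all function and variable names.

```python
import math
import random

def fit_exponential(xs):
    """MLE for the exponential rate: 1 / sample mean."""
    return len(xs) / sum(xs)

def fit_pareto(xs):
    """MLE for a Pareto whose scale xm is pinned to the sample minimum."""
    xm = min(xs)
    alpha = len(xs) / sum(math.log(x / xm) for x in xs)
    return xm, alpha

def fit_weibull(xs, lo=0.01, hi=20.0, iters=60):
    """MLE for a Weibull: bisect on the shape k, then solve for the scale."""
    n = len(xs)
    mean_log = sum(math.log(x) for x in xs) / n

    def score(k):
        # Shape equation of the Weibull MLE; it is increasing in k.
        sk = sum(x ** k for x in xs)
        skl = sum((x ** k) * math.log(x) for x in xs)
        return skl / sk - 1.0 / k - mean_log

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = (lo + hi) / 2.0
    scale = (sum(x ** k for x in xs) / n) ** (1.0 / k)
    return k, scale

def fit_hyperexp2(xs, iters=200):
    """EM for a two-phase hyperexponential (mixture of two exponentials)."""
    n = len(xs)
    mean = sum(xs) / n
    p, l1, l2 = 0.5, 2.0 / mean, 0.5 / mean  # arbitrary starting point
    for _ in range(iters):
        # E-step: probability that each sample came from phase 1.
        resp = []
        for x in xs:
            a = p * l1 * math.exp(-l1 * x)
            b = (1.0 - p) * l2 * math.exp(-l2 * x)
            resp.append(a / (a + b))
        # M-step: re-estimate the mixing weight and both rates.
        s = sum(resp)
        p = s / n
        l1 = s / sum(r * x for r, x in zip(resp, xs))
        l2 = (n - s) / sum((1.0 - r) * x for r, x in zip(resp, xs))
    return p, l1, l2

def ks_statistic(xs, cdf):
    """Largest gap between the empirical CDF and a fitted CDF."""
    s = sorted(xs)
    n = len(s)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(s))

# Synthetic stand-in for availability durations (NOT the paper's data).
random.seed(0)
durations = [x for x in (random.weibullvariate(10.0, 0.5) for _ in range(2000))
             if x > 0]

lam = fit_exponential(durations)
xm, alpha = fit_pareto(durations)
k, scale = fit_weibull(durations)
p, l1, l2 = fit_hyperexp2(durations)

results = {
    "exponential": ks_statistic(durations, lambda x: 1 - math.exp(-lam * x)),
    "pareto": ks_statistic(durations, lambda x: 1 - (xm / x) ** alpha),
    "weibull": ks_statistic(
        durations, lambda x: 1 - math.exp(-((x / scale) ** k))),
    "hyperexponential": ks_statistic(
        durations,
        lambda x: 1 - p * math.exp(-l1 * x) - (1 - p) * math.exp(-l2 * x)),
}
for name, d in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:17s} KS distance = {d:.4f}")
```

A smaller KS distance means the fitted curve tracks the empirical distribution more closely. Because the sample here is drawn from a Weibull, the ranking it prints says nothing about the measured data sets; replacing `durations` with real availability intervals is what would make the comparison meaningful.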