Internet service performance failure detection

  • Authors:
  • Amy Ward; Peter Glynn; Kathy Richardson

  • Affiliations:
  • Engineering Economic Systems & Operations Research Department, Stanford University, Stanford, CA; Engineering Economic Systems & Operations Research Department, Stanford University, Stanford, CA; Western Research Labs, Digital Equipment Corporation, Palo Alto, CA

  • Venue:
  • ACM SIGMETRICS Performance Evaluation Review
  • Year:
  • 1998

Abstract

The increasing complexity of computer networks and our increasing dependence on them mean that enforcing reliability requirements is both more challenging and more critical. The expansion of network services to include both traditional interconnect services and user-oriented services such as the web and email has guaranteed both the increased complexity of networks and the increased importance of their performance. The first step toward increasing reliability is early detection of network performance failures. Here we consider the applicability of statistical model frameworks under the most general assumptions possible. Using measurements from corporate proxy servers, we test the framework against real-world failures. The results of these experiments show that we can detect failures, but with a tradeoff in warning time: either we miss early warning signs or we report some false warnings. Finally, we offer insight into the problem of failure diagnosis.