A query language for understanding component interactions in production systems

  • Authors: Adam J. Oliner; Alex Aiken

  • Affiliations: Stanford University; Stanford University

  • Venue: Proceedings of the 24th ACM International Conference on Supercomputing
  • Year: 2010

Abstract

When something unexpected happens in a large production system, administrators must first perform a search to isolate which components and component interactions are likely to be involved. The system may consist of thousands of interacting subsystems, the logging instrumentation may be noisy or incomplete, and the problem description may be vague, so this search is often the most difficult part of understanding the system's behavior. To facilitate the search process, we present a query language and a method for computing these queries that makes minimal assumptions about the available data. We evaluate our method on nearly 1.22 billion lines of system logs from four supercomputers, two autonomous vehicles, and a server cluster.