Understanding Distributed Systems via Execution Trace Data

  • Authors:
  • Affiliations:
  • Venue: IWPC '01 Proceedings of the 9th International Workshop on Program Comprehension
  • Year: 2001

Abstract

One of the most challenging problems facing today's software engineer is understanding and modifying distributed systems. One reason is that, in actual use, systems frequently behave differently than the designer intended. We describe a three-step method that lets a developer understand the run-time behavior of a distributed system. First, remote procedure calls (RPCs) are traced using CORBA interceptors. Next, the trace data is parsed to construct RPC call-return sequences, and summary statistics are generated. Finally, a visualization tool is used to study the statistics and look for anomalous behavior. We are testing this method on a large distributed system (more than 600,000 lines of code) during operation at a customer's site. Although the system has been in operation for over three years, the method is revealing system configuration and efficiency problems.
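
To make the second step concrete, the following is a minimal sketch of how call and return events could be paired into call-return sequences and summarized. It is not the authors' tool: the record layout (timestamp in milliseconds, thread id, event kind, operation name), the sample data, and the function names `pair_calls` and `summarize` are all assumptions made for illustration, and the abstract does not specify the actual trace format or which statistics are computed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace records: (timestamp_ms, thread_id, event, operation).
# The real interceptor output format is not described in the abstract.
TRACE = [
    (0, "t1", "call", "Account.getBalance"),
    (4, "t1", "return", "Account.getBalance"),
    (5, "t1", "call", "Account.transfer"),
    (6, "t1", "call", "Ledger.append"),
    (9, "t1", "return", "Ledger.append"),
    (15, "t1", "return", "Account.transfer"),
    (20, "t2", "call", "Account.getBalance"),
    (26, "t2", "return", "Account.getBalance"),
]


def pair_calls(trace):
    """Match each return with the most recent open call on the same thread.

    Returns a list of (operation, start_ms, duration_ms) tuples, ordered by
    the time at which each call returned.
    """
    stacks = defaultdict(list)  # one stack of open calls per thread
    pairs = []
    for ts, thread, event, op in sorted(trace):
        if event == "call":
            stacks[thread].append((op, ts))
        elif event == "return":
            if not stacks[thread] or stacks[thread][-1][0] != op:
                raise ValueError(f"unmatched return for {op} on {thread}")
            _, start = stacks[thread].pop()
            pairs.append((op, start, ts - start))
        else:
            raise ValueError(f"unknown event kind: {event}")
    return pairs


def summarize(pairs):
    """Compute per-operation call count, mean duration, and maximum duration."""
    durations = defaultdict(list)
    for op, _start, duration in pairs:
        durations[op].append(duration)
    return {
        op: {"count": len(ds), "mean_ms": mean(ds), "max_ms": max(ds)}
        for op, ds in durations.items()
    }


if __name__ == "__main__":
    for op, stats in sorted(summarize(pair_calls(TRACE)).items()):
        print(op, stats)
```

A per-thread stack is used here because nested calls on one thread return in last-in, first-out order. A summary table of this kind is the sort of input a visualization tool could then use to flag operations with unusually high counts or durations.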