Trace Factory: Generating Workloads for Trace-Driven Simulation of Shared-Bus Multiprocessors

  • Authors:
  • Roberto Giorgi;Cosimo Antonio Prete;Gianpaolo Prina;Luigi Ricciardi

  • Venue:
  • IEEE Parallel & Distributed Technology: Systems & Technology
  • Year:
  • 1997

Abstract

A major concern with high-performance general-purpose workstations is to speed up the execution of commands, uniprocess applications, and multiprocess applications with coarse- to medium-grain parallelism. To that end, a simple extension of a uniprocessor machine, such as a shared-bus, shared-memory architecture, can be employed. Both kinds of machines generally use the same OS model, and the same application can execute on either without recoding. However, an intrinsic limitation of the shared-bus architecture is the low number of processors that can be connected to the shared bus. When this number exceeds a critical value, the system's global performance drops drastically because of bus saturation. Part of the bus traffic comes from keeping caches coherent: when two or more processors store a copy of the same memory block in their respective caches and one of them writes to a location in that block, a set of bus actions is necessary to guarantee that every subsequent read by any processor gets the up-to-date value of the modified location. Typically, researchers use simulation to investigate how to improve the performance of such machines. In particular, trace-driven simulation offers a good trade-off among speed, accuracy, and flexibility. A key point of this methodology is to find traces that both represent typical operating conditions and include all the information potentially needed for an accurate simulation of the system. The authors have developed a methodology and a set of tools (called Trace Factory) to generate traces for the performance evaluation of shared-bus, shared-memory multiprocessor systems. Trace Factory is particularly useful for evaluating a multiprocessor architecture's performance under different workloads and under most of the operating-system activities that influence it. The designer can evaluate and tune architectural solutions for the coherence protocol, cache structure, bus, and memory.
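
To illustrate how a trace-driven simulation can expose the coherence-related bus traffic described above, here is a minimal sketch in Python. It is not Trace Factory and does not reproduce the authors' tools or trace format. Everything in it is an assumption made for illustration: the `(cpu, op, address)` record layout, the `simulate` function name, the 32-byte block size, unbounded per-processor caches, and a simple two-state write-invalidate protocol.

```python
BLOCK_SIZE = 32  # bytes per cache block (assumed value, not from the paper)


def simulate(trace, num_cpus, block_size=BLOCK_SIZE):
    """Replay (cpu, op, address) records and count bus transactions.

    Hypothetical model: each CPU has an unbounded cache, and coherence is
    kept by write-invalidate with two states per cached block:
    'S' (shared, clean) and 'M' (modified, held by one cache only).
    """
    caches = [dict() for _ in range(num_cpus)]  # per CPU: block -> state
    stats = {"read_miss": 0, "write_miss": 0, "upgrade": 0, "invalidation": 0}

    for cpu, op, addr in trace:
        block = addr // block_size
        state = caches[cpu].get(block)

        if op == "R":
            if state is None:
                # Read miss: fetch the block over the bus; a remote
                # modified copy is downgraded to shared.
                stats["read_miss"] += 1
                for other in range(num_cpus):
                    if other != cpu and caches[other].get(block) == "M":
                        caches[other][block] = "S"
                caches[cpu][block] = "S"
        elif op == "W":
            if state != "M":
                # The writer needs exclusive ownership, which costs one
                # bus transaction (a miss or an upgrade of a shared copy).
                if state is None:
                    stats["write_miss"] += 1
                else:
                    stats["upgrade"] += 1
                # Every other cached copy of the block is invalidated.
                for other in range(num_cpus):
                    if other != cpu and block in caches[other]:
                        del caches[other][block]
                        stats["invalidation"] += 1
                caches[cpu][block] = "M"
        else:
            raise ValueError(f"unknown operation: {op!r}")

    stats["bus_transactions"] = (
        stats["read_miss"] + stats["write_miss"] + stats["upgrade"]
    )
    return stats


# Two CPUs share one block; CPU 0 then writes it and CPU 1 reads it again.
example_trace = [(0, "R", 0x100), (1, "R", 0x104), (0, "W", 0x108), (1, "R", 0x100)]
example_stats = simulate(example_trace, num_cpus=2)
```

In this example all three addresses fall in the same block, so the write by CPU 0 invalidates CPU 1's copy and CPU 1's next read misses again. A realistic simulator would also model finite caches, replacement, and bus timing, which is where traces that capture operating-system activity become important.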