On-the-fly verification of memory consistency with concurrent relaxed scoreboards

  • Authors:
  • Leandro S. Freitas;Eberle A. Rambo;Luiz C. V. dos Santos

  • Affiliations:
  • Federal University of Santa Catarina, Brazil;Federal University of Santa Catarina, Brazil;Federal University of Santa Catarina, Brazil

  • Venue:
  • Proceedings of the Conference on Design, Automation and Test in Europe
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Parallel programming requires the definition of shared-memory semantics by means of a consistency model, which affects how the parallel hardware is designed. Therefore, verifying the hardware compliance with a consistency model is a relevant problem, whose complexity depends on the observability of memory events. Post-silicon checkers analyze a single sequence of events per core and so do most pre-silicon checkers, although one reported method samples two sequences per core. Besides, most are post-mortem checkers requiring the whole sequence of events to be available prior to verification. On the contrary, this paper describes a novel on-the-fly technique for verifying memory consistency from an executable representation of a multicore system. To increase efficiency without hampering verification guarantees, three points are monitored per core. The sampling points are selected to be largely independent from the core's microarchitecture. The technique relies on concurrent relaxed scoreboards to check for consistency violations in each core. To check for global violations, it employs a linear order of events induced by a given test case. We prove that the technique neither indicates false negatives nor false positives when the test case exposes an error that affects the sampled sequences, making it the first on-the-fly checker with full guarantees. We compare our technique with two post-mortem checkers under 2400 scenarios for platforms with 2 to 8 cores. The results show that our technique is at least 100 times faster than a checker sampling a single sequence per processor and it needs approximately 1/4 to 3/4 of the overall verification effort required by a post-mortem checker sampling two sequences per processor.