SCUD: a fast single-pass L1 cache simulation approach for embedded processors with round-robin replacement policy

  • Authors: Mohammad Shihabul Haque; Jorgen Peddersen; Andhi Janapsatya; Sri Parameswaran

  • Affiliations: University of New South Wales, Sydney, Australia (all authors)

  • Venue: Proceedings of the 47th Design Automation Conference
  • Year: 2010

Abstract

Embedded systems designers are free to choose the most suitable L1 cache configuration in modern processor-based SoCs. Choosing an appropriate configuration requires simulating long memory access traces to obtain accurate hit/miss rates. The long execution time needed to simulate these traces, particularly when each configuration requires a separate simulation, is a major drawback. Researchers have proposed techniques to speed up the simulation of caches with the LRU replacement policy. These techniques are of little use for the majority of embedded processors, as these processors use caches based on the Round-robin policy. In this paper we propose a fast L1 cache simulation approach, called SCUD (Sorted Collection of Unique Data), for caches with the Round-robin policy. SCUD is a single-pass cache simulator that can simulate multiple L1 cache configurations (with varying set sizes and associativities) by reading the application trace once. Utilizing fast binary searches in a novel data structure, SCUD simulates an application trace significantly faster than a widely used single-configuration cache simulator (Dinero IV). We show that SCUD can simulate a set of cache configurations up to 57 times faster than Dinero IV. SCUD shows an average speedup of 19.34 times over Dinero IV for Mediabench applications, and an average speedup of over 10 times for SPEC CPU2000 applications.
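
To make the Round-robin policy and the single-pass idea concrete, the following is a minimal Python sketch. It is not SCUD: the abstract does not describe SCUD's data structure, so this sketch simply keeps one independent cache model per configuration and feeds every trace address to all of them in a single read of the trace. The names `RoundRobinCache` and `simulate_single_pass`, and the `(num_sets, assoc, line_size)` configuration tuple, are assumptions made for illustration only.

```python
class RoundRobinCache:
    """Set-associative cache model with Round-robin (FIFO-pointer) replacement."""

    def __init__(self, num_sets, assoc, line_size):
        self.num_sets = num_sets
        self.assoc = assoc
        self.line_size = line_size
        self.hits = 0
        self.misses = 0
        # tags[s][w] holds the tag stored in way w of set s (None = empty).
        self.tags = [[None] * assoc for _ in range(num_sets)]
        # One replacement pointer per set; it advances only on a miss.
        self.next_victim = [0] * num_sets

    def access(self, address):
        block = address // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets
        ways = self.tags[index]
        if tag in ways:
            # A hit does not change Round-robin state (unlike LRU).
            self.hits += 1
            return True
        self.misses += 1
        victim = self.next_victim[index]
        ways[victim] = tag
        self.next_victim[index] = (victim + 1) % self.assoc
        return False


def simulate_single_pass(trace, configs):
    """Read the trace once, updating every configuration per address.

    configs is an iterable of hypothetical (num_sets, assoc, line_size) tuples.
    Returns a dict mapping each config to its (hits, misses) counts.
    """
    configs = list(configs)
    caches = [RoundRobinCache(*cfg) for cfg in configs]
    for address in trace:
        for cache in caches:
            cache.access(address)
    return {cfg: (c.hits, c.misses) for cfg, c in zip(configs, caches)}


if __name__ == "__main__":
    trace = [0, 1, 0, 2, 0, 3]
    for cfg, (hits, misses) in simulate_single_pass(trace, [(1, 2, 1), (1, 4, 1)]).items():
        print(cfg, "hits:", hits, "misses:", misses)
```

On the trace `[0, 1, 0, 2, 0, 3]` with one set, two ways, and one-byte lines, this model records 1 hit and 5 misses, whereas an LRU cache of the same shape would record 2 hits, because the hit on address 0 does not protect it from eviction under Round-robin. The sketch performs a linear tag search per configuration on every access, which is the kind of repeated work the abstract says SCUD reduces with binary searches in its own data structure.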