This paper presents the active cache emulator (ACE), a novel field-programmable gate array (FPGA)-based emulator that models an L3 cache actively and in real time. ACE leverages interactions with its host system to model the target system. Unlike most existing FPGA-based cache emulators, which only collect memory traces from their host system, ACE provides feedback to its host by injecting delays that time-dilate the host system so that it experiences the hit/miss latencies of the emulated cache. Such active emulation expands the scope of performance evaluation by allowing measurement of system performance metrics (e.g., CPI, operations per second, frame rate) in addition to the cache-specific metrics (e.g., miss ratio) that existing emulators provide. ACE is designed to interface with the front-side bus (FSB) of a typical Pentium-based PC system and uses the FSB snoop stall mechanism to inject delays into the system. At present, ACE is implemented on a Xilinx XC2V6000 FPGA running at 66 MHz, the same speed as its host's FSB. ACE is verified by using the cache calibrator and RightMark memory analyzer software to confirm that the host system properly detects the emulated cache, and by comparing ACE results with SimpleScalar software simulations. Finally, ACE is used to study L3 caches for compute-intensive, throughput-oriented, and real-time gaming benchmarks (SPEC-CPU2000, SPEC-JBB2000, Quake3). The study shows that analyzing only cache-specific metrics, as existing L3 cache studies with FPGA emulators do, is insufficient. Active emulation mitigates this issue by providing a broader view of performance, allowing researchers to draw better-supported conclusions.
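The idea of active emulation can be illustrated with a small software sketch. It is not the ACE hardware design: the class name `EmulatedCache`, the LRU replacement policy, the cache geometry, and the hit/miss latencies below are all assumptions chosen for illustration. The sketch keeps only cache tags and, for each observed address, returns the number of stall cycles that would be injected into the host, while also counting hits and misses for the usual cache-specific metrics.

```python
from collections import OrderedDict


class EmulatedCache:
    """Set-associative tag store with LRU replacement (hypothetical model).

    Only tags are tracked; no data is stored. The replacement policy and
    all parameters are illustrative assumptions, not taken from ACE.
    """

    def __init__(self, size_bytes, line_bytes, ways, hit_cycles, miss_cycles):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        self.hit_cycles = hit_cycles      # delay injected on an emulated hit
        self.miss_cycles = miss_cycles    # delay injected on an emulated miss
        # One ordered dict per set: oldest (least recently used) tag first.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Look up one address and return the stall cycles to inject."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        tags = self.sets[index]
        if tag in tags:
            tags.move_to_end(tag)         # mark as most recently used
            self.hits += 1
            return self.hit_cycles
        if len(tags) >= self.ways:
            tags.popitem(last=False)      # evict the least recently used tag
        tags[tag] = True
        self.misses += 1
        return self.miss_cycles

    def miss_ratio(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


# Tiny made-up trace: a 256-byte, 2-way cache with 64-byte lines (2 sets).
cache = EmulatedCache(size_bytes=256, line_bytes=64, ways=2,
                      hit_cycles=2, miss_cycles=20)
trace = [0x000, 0x040, 0x000, 0x080, 0x100, 0x000]
delays = [cache.access(a) for a in trace]
total_stall = sum(delays)
```

A passive emulator would report only `cache.miss_ratio()`. An active one also applies `delays` to the running host, so the effect of those stalls shows up in system-level measurements such as CPI or frame rate.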