Post-silicon processor debugging is frequently carried out in a loop consisting of several iterations of two key steps: (i) processor execution for some duration, followed by (ii) dumping the processor's internal state to an external logic analyzer for further offline processing. The processor's internal state is dominated by the L2 cache. While the cache contents are being dumped, the processor's execution is halted so that the state can be faithfully reproduced offline. To reduce the duration for which the processor is halted, and thereby reduce debug time, we propose two Online Cache Dumping strategies, Retransmit Non-dumped Line (RNL) and Dump History Table (DHT), which transfer the cache contents while the processor continues executing, yet maintain fidelity of the dumped data. For typical experimental debug scenarios, we observe that the effective dump times are reduced to between 0.01% and 3.5% of the original times. We also employ compression to reduce the cache content transfer time and the logic analyzer space required. Our experiments indicate an average compression ratio of 59.2%.
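The core fidelity problem the online strategies address can be sketched in a few lines of Python. This is a hypothetical illustration, not the paper's implementation: the class name, fields, and the exact retransmission policy are assumptions. The idea shown is that of a dump-history structure: cache lines are streamed out while the "processor" keeps writing, and any line modified after it has already been sent is recorded and retransmitted, so the trace received by the logic analyzer remains consistent with the final cache state.

```python
# Hypothetical sketch of a DHT-style online cache dump (names and policy
# are illustrative assumptions, not the paper's actual design).
class OnlineCacheDumper:
    def __init__(self, num_lines):
        self.cache = [0] * num_lines        # cache line contents
        self.dumped = [False] * num_lines   # dump history: line already sent?
        self.redump = set()                 # lines dirtied after being sent
        self.trace = []                     # (line index, value) pairs sent out

    def processor_write(self, idx, value):
        # The processor keeps running during the dump; a write to an
        # already-dumped line must be resent to preserve fidelity.
        self.cache[idx] = value
        if self.dumped[idx]:
            self.redump.add(idx)

    def dump_pass(self):
        # Stream out every line not yet dumped.
        for idx, value in enumerate(self.cache):
            if not self.dumped[idx]:
                self.trace.append((idx, value))
                self.dumped[idx] = True

    def retransmit(self):
        # Second pass: resend only the lines modified during the dump.
        for idx in sorted(self.redump):
            self.trace.append((idx, self.cache[idx]))
        self.redump.clear()

dumper = OnlineCacheDumper(4)
dumper.dump_pass()                 # initial streaming of all 4 lines
dumper.processor_write(2, 99)      # write lands after line 2 was dumped
dumper.retransmit()                # line 2 is sent again with its new value
print(dumper.trace[-1])            # -> (2, 99)
```

Reconstructing the cache offline from such a trace is simple: replay the (index, value) pairs in order, letting later entries overwrite earlier ones, which yields the cache state at the moment the dump completed.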