BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
A path-based methodology for post-silicon timing validation
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
A reconfigurable design-for-debug infrastructure for SoCs
Proceedings of the 43rd annual Design Automation Conference
PIN: a binary instrumentation tool for computer architecture research and education
WCAE '04 Proceedings of the 2004 workshop on Computer architecture education: held in conjunction with the 31st International Symposium on Computer Architecture
Automating post-silicon debugging and repair
Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design
IFRA: instruction footprint recording and analysis for post-silicon bug localization in processors
Proceedings of the 45th annual Design Automation Conference
On automated trigger event generation in post-silicon validation
Proceedings of the conference on Design, automation and test in Europe
Proceedings of the conference on Design, automation and test in Europe
A Design-for-Debug (DfD) for NoC-Based SoC Debugging via NoC
ATS '08 Proceedings of the 2008 17th Asian Test Symposium
Online cache state dumping for processor debug
Proceedings of the 46th Annual Design Automation Conference
BLoG: post-silicon bug localization in processors using bug localization graphs
Proceedings of the 47th Design Automation Conference
Scalable specification mining for verification and diagnosis
Proceedings of the 47th Design Automation Conference
A high-level debug environment for communication-centric debug
Proceedings of the Conference on Design, Automation and Test in Europe
Trace signal selection for visibility enhancement in post-silicon validation
Proceedings of the Conference on Design, Automation and Test in Europe
On signal tracing in post-silicon validation
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Compressing Cache State for Postsilicon Processor Debug
IEEE Transactions on Computers
On Using Lossy Compression for Repeatable Experiments during Silicon Debug
IEEE Transactions on Computers
Embedded debug architecture for bypassing blocking bugs during post-silicon validation
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Transaction based pre-to-post silicon validation
Proceedings of the 48th Design Automation Conference
Simulation-based signal selection for state restoration in silicon debug
Proceedings of the International Conference on Computer-Aided Design
Post-silicon bug diagnosis with inconsistent executions
Proceedings of the International Conference on Computer-Aided Design
Trace signal selection to enhance timing and logic visibility in post-silicon validation
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.00 |
The internal state of complex modern processors often needs to be dumped out frequently during post-silicon validation. Since the last level cache (considered L2 in this paper) holds most of the state, the volume of data dumped and the transfer time are dominated by the L2 cache. The limited bandwidth to transfer data off-chip coupled with the large size of L2 cache results in stalling the processor for long durations when dumping the cache contents off-chip. To alleviate this, we propose to transfer only those cache lines that were updated since the previous dump. Since maintaining a bit-vector with a separate bit to track the status of individual cache lines is expensive, we propose 2 methods: (i) where a bit tracks multiple cache lines and (ii) an Interval Table which stores only the starting and ending addresses of continuous runs of updated cache lines. Both methods require significantly lesser space compared to a bit-vector, and allow the designer to choose the amount of space to allocate for this design-for-debug (DFD) feature. The impact of reducing storage space is that some non-updated cache lines are dumped too. We attempt to minimize such overheads. Further, the Interval Table is independent of the cache size which makes it ideal for large caches. Through experimentation, we also determine the break-even point below which a t-lines/bit bit-vector is beneficial compared to an Interval Table.