This paper proposes an efficient method for analyzing the worst-case interruption delay (WCID) of a workload running on a modern microprocessor, using a cycle-accurate simulator (CAS). The method is highly accurate because it simulates every possible case, inserting an interruption just before the retirement of each instruction executed in the workload. It is also reasonably efficient: for a workload with N executed instructions it takes O(N log N) time, instead of the O(N^2) time of a straightforward iterative simulation of interrupted executions. The key to this efficiency is that two executions with different interruption points share a set of durations in which they behave identically, so the simulation of those durations can be omitted for one of the two. We implemented the method by modifying the SimpleScalar tool set and show that it finds the WCID of workloads with five million executed instructions in a reasonable time, less than 30 minutes, where the straightforward method would take 200-300 days. We also show that a parallelized version of the method achieves a good speedup, about 7-fold on an 8-node PC cluster.
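To make the quadratic-time baseline concrete, here is a minimal sketch of the straightforward iterative approach: rerun the whole workload once per interruption point and keep the largest slowdown. It is not the paper's simulator or its O(N log N) method. The toy direct-mapped cache, the cycle costs, the assumption that an interruption flushes the cache, the definition of delay as extra cycles over the uninterrupted run, and the names `simulate` and `naive_wcid` are all illustrative assumptions.

```python
from typing import List, Optional, Tuple

HIT_CYCLES = 1     # assumed cost of a cache hit
MISS_CYCLES = 10   # assumed cost of a cache miss
CACHE_LINES = 4    # assumed size of the toy direct-mapped cache


def simulate(trace: List[int], interrupt_at: Optional[int] = None) -> int:
    """Return total cycles for a trace of memory addresses.

    An interruption just before instruction `interrupt_at` is modeled
    (as an assumption) by flushing the cache; handler time is ignored.
    """
    cache: List[Optional[int]] = [None] * CACHE_LINES
    cycles = 0
    for i, addr in enumerate(trace):
        if i == interrupt_at:
            cache = [None] * CACHE_LINES  # handler evicts every line
        line = addr % CACHE_LINES
        if cache[line] == addr:
            cycles += HIT_CYCLES
        else:
            cycles += MISS_CYCLES
            cache[line] = addr
    return cycles


def naive_wcid(trace: List[int]) -> Tuple[int, int]:
    """Try every interruption point: N runs of O(N) each, so O(N^2) overall.

    Returns (worst extra cycles, first interruption point that causes them).
    """
    baseline = simulate(trace)
    worst_delay, worst_point = 0, 0
    for point in range(len(trace)):
        delay = simulate(trace, point) - baseline
        if delay > worst_delay:
            worst_delay, worst_point = delay, point
    return worst_delay, worst_point
```

For the trace `[0, 1, 2, 3, 0, 1, 2, 3]`, `naive_wcid` returns `(36, 4)`: flushing just before the second pass turns four hits into four misses. Every call to `simulate` replays the prefix before the interruption point even though that prefix is identical across runs, which is the kind of redundancy the abstract says can be skipped.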