Program optimization for instruction caches
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Profile guided code positioning
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Two-level adaptive training branch prediction
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Reducing branch costs via branch alignment
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Near-optimal intraprocedural branch alignment
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Trading conflict and capacity aliasing in conditional branch predictors
Proceedings of the 24th annual international symposium on Computer architecture
Procedure placement using temporal-ordering information
ACM Transactions on Programming Languages and Systems (TOPLAS)
Improving locality by critical working sets
Communications of the ACM
An efficient profile-analysis framework for data-layout optimizations
POPL '02 Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
A study of branch prediction strategies
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
Microbenchmarks for determining branch predictor organization
Software—Practice & Experience - Research Articles
Code placement for improving dynamic branch prediction accuracy
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Power prediction for intel XScale® processors using performance monitoring unit events
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Performance Profiling and Analysis of DoD Applications Using PAPI and TAU
DOD_UGC '05 Proceedings of the 2005 Users Group Conference on 2005 Users Group Conference
The Camino Compiler infrastructure
ACM SIGARCH Computer Architecture News - Special issue on the 2005 workshop on binary instrumentation and application
Producing wrong data without doing anything obviously wrong!
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Program restructuring for virtual memory
IBM Systems Journal
Hi-index | 0.00 |
Modern microprocessors have many microarchitectural features. Quantifying the performance impact of one feature such as dynamic branch prediction can be difficult. On one hand, a timing simulator can predict the difference in performance given two different implementations of the technique, but simulators can be quite inaccurate. On the other hand, real systems are very accurate representations of themselves, but often cannot be modified to study the impact of a new technique. We demonstrate how to develop a performance model for branch prediction using real systems based on object code reordering. By observing the behavior of the benchmarks over a range of branch prediction accuracies, we can estimate the impact of a new branch predictor by simulating only the predictor and not the rest of the microarchitecture. We also use the reordered object code to validate a reverse-engineered model for the Intel Core 2 branch predictor. We simulate several branch predictors using Pin and measure which hypothetical branch predictor has the highest correlation with the real one. This study in object code reorder points to way to future work on estimating the impact of other structures such as the instruction cache, the second-level cache, instruction decoders, indirect branch prediction, etc.