Studying microarchitectural structures with object code reordering

Authors:
Shah Mohammad Faizur Rahman;Zhe Wang;Daniel A. Jiménez
Affiliations:
The University of Texas at San Antonio;The University of Texas at San Antonio;The University of Texas at San Antonio
Venue:
Proceedings of the Workshop on Binary Instrumentation and Applications
Year:
2009

Abstract

Modern microprocessors have many microarchitectural features. Quantifying the performance impact of a single feature, such as dynamic branch prediction, can be difficult. On one hand, a timing simulator can predict the difference in performance between two implementations of a technique, but simulators can be quite inaccurate. On the other hand, real systems are very accurate representations of themselves, but often cannot be modified to study the impact of a new technique. We demonstrate how to develop a performance model for branch prediction on real systems, based on object code reordering. By observing the behavior of the benchmarks over a range of branch prediction accuracies, we can estimate the impact of a new branch predictor by simulating only the predictor and not the rest of the microarchitecture. We also use the reordered object code to validate a reverse-engineered model of the Intel Core 2 branch predictor. We simulate several branch predictors using Pin and measure which hypothetical branch predictor has the highest correlation with the real one. This study in object code reordering points the way to future work on estimating the impact of other structures, such as the instruction cache, the second-level cache, instruction decoders, and indirect branch prediction.
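
To make the idea in the abstract concrete, here is a minimal Python sketch of how such a performance model might be fitted and used. It is not the authors' implementation. It assumes that performance (cycles per instruction, CPI) varies roughly linearly with mispredictions per kilo-instruction (MPKI) across reordered builds of one benchmark, which the abstract does not state. The names `fit_line`, `pearson`, and `estimate_cpi` are hypothetical, and every number below is invented for illustration rather than measured.

```python
# Illustrative sketch only. The linear model and ALL numbers are assumptions
# made up for this example; none of them come from the paper.
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical counter readings: one (MPKI, CPI) pair per reordered build
# of the same benchmark, all run on the same real machine.
measured_mpki = [4.0, 5.0, 6.0, 7.0, 8.0]
measured_cpi = [1.00, 1.02, 1.04, 1.06, 1.08]

slope, intercept = fit_line(measured_mpki, measured_cpi)


def estimate_cpi(predicted_mpki):
    """Estimate CPI for a new predictor from its simulated MPKI alone."""
    return slope * predicted_mpki + intercept


# Hypothetical MPKI produced by two candidate predictor models on the same
# reordered builds; the better-correlated one is the closer match to hardware.
model_a_mpki = [4.1, 5.2, 5.9, 7.1, 8.0]
model_b_mpki = [6.0, 5.0, 7.0, 5.0, 6.0]
corr_a = pearson(measured_mpki, model_a_mpki)
corr_b = pearson(measured_mpki, model_b_mpki)

print(f"estimated CPI at 3.0 MPKI: {estimate_cpi(3.0):.3f}")
print(f"model A correlation: {corr_a:.3f}, model B correlation: {corr_b:.3f}")
```

The first half mirrors the estimation step: once the line is fitted from real-machine measurements, only the candidate predictor needs to be simulated to obtain its MPKI. The second half mirrors the validation step, where the candidate model whose misprediction counts track the hardware most closely is taken as the best match.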