Reality-based optimization

Authors:
Scott McFarling
Affiliations:
Microsoft Research, Redmond, WA
Venue:
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Year:
2003

Citing 19
Cited 5

Global register allocation at link time
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction

Reducing the cost of branches
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture

Program optimization for instruction caches
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems

Achieving high instruction cache performance with an optimizing compiler
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture

Profile guided code positioning
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation

Procedure merging with instruction caches
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation

Predicting conditional branch directions from previous runs of a program
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems

ATOM: a system for building customized program analysis tools
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation

System support for automatic profiling and optimization
Proceedings of the sixteenth ACM symposium on Operating systems principles

Optimizing alpha executables on Windows NT with spike
Digital Technical Journal

Optimal Sequential Partitions of Graphs
Journal of the ACM (JACM)

Dynamo: a transparent dynamic optimization system
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation

Improving locality by critical working sets
Communications of the ACM

Code layout optimizations for transaction processing workloads
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture

SPEC CPU2000: Measuring CPU Performance in the New Millennium
Computer

Optimizing instruction cache performance for operating system intensive workloads
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture

Progressive profiling: a methodology based on profile propagation and selective profile collection
Progressive profiling: a methodology based on profile propagation and selective profile collection

Profile-directed restructuring of operating system code
IBM Systems Journal

Evaluating the importance of user-specific profiling
WINSYM'98 Proceedings of the 2nd conference on USENIX Windows NT Symposium - Volume 2

Using trace analysis for improving performance in COTS systems
CASCON '04 Proceedings of the 2004 conference of the Centre for Advanced Studies on Collaborative research

Performance of Runtime Optimization on BLAST
Proceedings of the international symposium on Code generation and optimization

TRICK: tracking and reusing compiler's knowledge
ACM SIGPLAN Notices

Evaluating the correspondence between training and reference workloads in SPEC CPU2006
ACM SIGARCH Computer Architecture News

Global critical path: a tool for system-level timing analysis
Proceedings of the 44th annual Design Automation Conference

Abstract

Profile-based optimization has been studied extensively. Numerous papers and real systems have shown substantial improvements. However, most of these papers have been limited to either branch prediction or instruction cache performance. Also, most of these papers have looked at small applications with a limited number of testing and training scenarios.

In this paper, we look at real use of large real-world desktop applications. We also assume memory consumption and disk performance are the primary metrics of interest. For this domain, we show that it is very difficult to get adequate coverage of large applications even with an extensive collection of training scenarios. We propose instead to augment traditional scenarios with data derived from real use. We show that this methodology allows us to reduce memory pressure by 29% and disk reads by 33% compared to traditional approaches.