A VLIW architecture for a trace scheduling compiler
ASPLOS II Proceedings of the second international conference on Architectural support for programming languages and operating systems
Efficient (stack) algorithms for analysis of write-back and sector memories
ACM Transactions on Computer Systems (TOCS)
Determining average program execution times and their variance
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
Machine Characterization Based on an Abstract High-Level Language Machine
IEEE Transactions on Computers
Efficient trace-driven simulation methods for cache performance analysis
ACM Transactions on Computer Systems (TOCS)
Using profile information to assist classic code optimizations
Software—Practice & Experience
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Set-associative cache simulation using generalized binomial trees
ACM Transactions on Computer Systems (TOCS)
Precise miss analysis for program transformations with caches of arbitrary associativity
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Dependence Analysis
Automatic Performance Prediction of Parallel Programs
Parallel Programming with Polaris
Computer
Measuring Cache and TLB Performance and Their Effect on Benchmark Runtimes
IEEE Transactions on Computers
SvPablo: A Multi-language Performance Analysis System
TOOLS '98 Proceedings of the 10th International Conference on Computer Performance Evaluation: Modelling Techniques and Tools
FlexRAM: Toward an Advanced Intelligent Memory System
ICCD '99 Proceedings of the 1999 IEEE International Conference on Computer Design
Analysis of Benchmark Characteristics and Benchmark Performance
Trace Scheduling: A Technique for Global Microcode Compaction
IEEE Transactions on Computers
Evaluation techniques for storage hierarchies
IBM Systems Journal
Adaptively Mapping Code in an Intelligent Memory Architecture
IMS '00 Revised Papers from the Second International Workshop on Intelligent Memory Systems
Predicting Cache Space Contention in Utility Computing Servers
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 10 - Volume 11
Understanding the behavior and implications of context switch misses
ACM Transactions on Architecture and Code Optimization (TACO)
A framework for an automatic hybrid MPI+OpenMP code generation
Proceedings of the 19th High Performance Computing Symposia
Adaptively increasing performance and scalability of automatically parallelized programs
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
PSnAP: accurate synthetic address streams through memory profiles
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Validating model-driven performance predictions on random software systems
QoSA'10 Proceedings of the 6th international conference on Quality of Software Architectures: research into Practice - Reality and Gaps
Resource optimization in distributed real-time multimedia applications
Multimedia Tools and Applications
Survey of scheduling techniques for addressing shared resources in multicore processors
ACM Computing Surveys (CSUR)
The Journal of Supercomputing
In this paper we present results obtained by using a compiler to predict the performance of scientific codes. The compiler, Polaris [3], is both the primary tool for estimating the performance of a range of codes and the beneficiary of the results obtained from predicting program behavior at compile time. We show that a simple compile-time model, augmented with profiling data gathered through very lightweight instrumentation, can predict performance to within 20% of measured values, on average, for codes using both dense and sparse computational methods.
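As an illustration of the general approach the abstract describes, the following sketch shows one simple way a compile-time cost model can be combined with profiled trip counts: the compiler statically estimates a per-iteration cost for each loop, and lightweight profiling supplies the loop trip counts that the compiler cannot determine statically. The operation latencies and loop data below are assumed values for illustration, not the paper's actual model or numbers.

```python
# Hypothetical per-operation latencies in cycles (assumed, for illustration).
OP_COST = {"fadd": 1, "fmul": 2, "load": 3, "store": 3}

def loop_cost(op_counts, trip_count):
    """Predicted cycles for one loop: static per-iteration cost x profiled trips."""
    per_iter = sum(OP_COST[op] * n for op, n in op_counts.items())
    return per_iter * trip_count

def predict(loops):
    """Sum predicted cycles over all profiled loops in the program."""
    return sum(loop_cost(ops, trips) for ops, trips in loops)

# Example: two loops whose trip counts come from lightweight profiling.
loops = [
    ({"fadd": 2, "load": 2, "store": 1}, 1000),  # 2*1 + 2*3 + 1*3 = 11 cycles/iter
    ({"fmul": 1, "load": 1}, 500),               # 2 + 3 = 5 cycles/iter
]
predicted = predict(loops)  # 11*1000 + 5*500 = 13500 cycles
```

The prediction error would then be the relative difference between `predicted` and the measured cycle count; the abstract's 20% average accuracy refers to that kind of comparison.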