High-level optimization via automated statistical modeling

Authors:
Eric A. Brewer
Affiliations:
University of California at Berkeley
Venue:
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
1995

Citing 18
Cited 38

A logarithmic time sort for linear size networks

Journal of the ACM (JACM)
Solving problems on concurrent processors. Vol. 1: General techniques and regular problems

Solving problems on concurrent processors. Vol. 1: General techniques and regular problems
An overview for the PTRAN analysis system for multiprocessing

Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
Congestion avoidance and control

SIGCOMM '88 Symposium proceedings on Communications architectures and protocols
The definition of Standard ML

The definition of Standard ML
Systems programming with Modula-3

Systems programming with Modula-3
Active messages: a mechanism for integrated communication and computation

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The network architecture of the Connection Machine CM-5 (extended abstract)

SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Profile-driven compilation

Profile-driven compilation
Model-driven mapping of computation onto distributed memory parallel computers

Model-driven mapping of computation onto distributed memory parallel computers
Model-driven mapping onto distributed memory parallel computers

Proceedings of the 1992 ACM/IEEE conference on Supercomputing
LogP: towards a realistic model of parallel computation

PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Programming models for irregular applications

ACM SIGPLAN Notices - Workshop on languages, compilers and run-time environments for distributed memory multiprocessors
Anatomy of a message in the Alewife multiprocessor

ICS '93 Proceedings of the 7th international conference on Supercomputing
Portable high-performance supercomputing: high-level platform-dependent optimization

Portable high-performance supercomputing: high-level platform-dependent optimization
Parallel performance prediction using lost cycles analysis

Proceedings of the 1994 ACM/IEEE conference on Supercomputing
How to Get Good Performance from the CM-5 Data Network

Proceedings of the 8th International Symposium on Parallel Processing
PRELUDE: A SYSTEM FOR PORTABLE PARALL

PRELUDE: A SYSTEM FOR PORTABLE PARALL

Adapting to network and client variability via on-demand dynamic distillation

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Programming language requirements for the next millennium

ACM Computing Surveys (CSUR) - Special issue: position statements on strategic directions in computing research
Dynamic feedback: an effective technique for adaptive computing

Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Eliminating synchronization overhead in automatically parallelized programs using dynamic feedback

ACM Transactions on Computer Systems (TOCS)
ILP versus TLP on SMT

SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Application-level scheduling on distributed heterogeneous networks

Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Stochastic search for signal processing algorithm optimization

Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Performance Modeling and Composition: A Case Study in Cell Simulation

IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
A Performance Estimator for Parallel Programs

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Performance Prediction with Benchmaps

IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Scheduling From the Perspective of the Application

HPDC '96 Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
Predicting the Running Times of Parallel Programs by Simulation

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Learning to construct fast signal processing implementations

The Journal of Machine Learning Research
Parallel program performance prediction using deterministic task graph analysis

ACM Transactions on Computer Systems (TOCS)
Architecture of an automatically tuned linear algebra library

Parallel Computing
A framework for adaptive algorithm selection in STAPL

Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Ligature: Component Architecture for High Performance Applications

International Journal of High Performance Computing Applications
Statistical Models for Empirical Search-Based Performance Tuning

International Journal of High Performance Computing Applications
SmartApps: middle-ware for adaptive applications on reconfigurable platforms

ACM SIGOPS Operating Systems Review
An Adaptive Algorithm Selection Framework for Reduction Parallelization

IEEE Transactions on Parallel and Distributed Systems
Measuring empirical computational complexity

Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Automatic Communication Performance Debugging in PGAS Languages

Languages and Compilers for Parallel Computing
Empirical hardness models: Methodology and a case study on combinatorial auctions

Journal of the ACM (JACM)
PetaBricks: a language and compiler for algorithmic choice

Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Performance modeling for dynamic algorithm selection

ICCS'03 Proceedings of the 2003 international conference on Computational science
Self-adapting numerical software and automatic tuning of heuristics

ICCS'03 Proceedings of the 2003 international conference on Computational science
Practical performance models of algorithms in evolutionary program induction and other domains

Artificial Intelligence
A case for machine learning to optimize multicore performance

HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
Comparing machine learning approaches for context-aware composition

SC'11 Proceedings of the 10th international conference on Software composition
Optimizing matrix multiplication with a classifier learning system

LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Language and compiler support for auto-tuning variable-accuracy algorithms

CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
Optimized composition of performance-aware parallel components

Concurrency and Computation: Practice & Experience
An automated approach to generating efficient constraint solvers

Proceedings of the 34th International Conference on Software Engineering
Adaptation of legacy codes to context-aware composition using aspect-oriented programming

SC'12 Proceedings of the 11th international conference on Software Composition
Mantis: automatic performance prediction for smartphone applications

USENIX ATC'13 Proceedings of the 2013 USENIX conference on Annual Technical Conference
Models of performance of evolutionary program induction algorithms based on indicators of problem difficulty

Evolutionary Computation
Algorithm runtime prediction: Methods & evaluation

Artificial Intelligence

Abstract

We develop the use of statistical modeling for portable high-level optimizations such as data layout and algorithm selection. We build the models automatically from profiling information, which ensures robust and accurate models that reflect all aspects of the target platform.

We use the models to select among several data layouts for an iterative PDE solver and to select among several sorting algorithms. The selection is correct more than 99% of the time on each of four platforms. In the few cases it selects suboptimally, the selected implementation performs nearly as well; that is, it always makes at least a very good choice. Correct selection is platform and workload dependent and can improve performance by nearly a factor of three.

We also use the models to optimize parameters of these applications automatically. In all cases, the models predicted the optimal parameter setting, resulting in improvements ranging up to a factor of three.

Finally, we use the models to construct portable high-level libraries, which contain multiple implementations and support for automatic selection and parameter optimization of the fastest implementation for the target platform and workload.
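
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the model form (a one-term least-squares fit), the names `CostModel`, `select`, `impl_a` and `impl_b`, and the timing numbers are all assumptions made for illustration. The timings are synthetic values generated from made-up formulas, not measurements. The sketch fits one cost model per implementation from profiled (input size, time) pairs and then picks the implementation with the lowest predicted time for a given input size.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


class CostModel:
    """Predicts run time as a + b * feature(n) for one implementation."""

    def __init__(self, name, feature):
        self.name = name
        self.feature = feature  # maps input size n to the model term
        self.a = 0.0
        self.b = 0.0

    def fit(self, samples):
        """samples: list of (input size, measured time) pairs from profiling."""
        xs = [self.feature(n) for n, _ in samples]
        ys = [t for _, t in samples]
        self.a, self.b = fit_line(xs, ys)

    def predict(self, n):
        return self.a + self.b * self.feature(n)


def select(models, n):
    """Return the name of the implementation with the lowest predicted time."""
    return min(models, key=lambda m: m.predict(n)).name


# Synthetic "profiling" data (hypothetical, noise-free, arbitrary time units).
# impl_a: high fixed cost, linear growth. impl_b: low fixed cost, n log n growth.
sizes = [16, 64, 256, 1024, 4096]
profile_a = [(n, 200.0 + 1.0 * n) for n in sizes]
profile_b = [(n, 10.0 + 0.5 * n * math.log2(n)) for n in sizes]

model_a = CostModel("impl_a", lambda n: n)
model_b = CostModel("impl_b", lambda n: n * math.log2(n))
model_a.fit(profile_a)
model_b.fit(profile_b)
models = [model_a, model_b]

small_choice = select(models, 32)       # low fixed cost should win here
large_choice = select(models, 100_000)  # linear growth should win here
print(small_choice, large_choice)
```

On a real platform the samples would come from timing runs on that platform, so the fitted coefficients, and therefore the crossover point between implementations, would differ from machine to machine.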