This article presents dynamic feedback, a technique that enables computations to adapt dynamically to different execution environments. A compiler that uses dynamic feedback produces several different versions of the same source code; each version uses a different optimization policy. The generated code alternately performs sampling phases and production phases. Each sampling phase measures the overhead of each version in the current environment. Each production phase uses the version with the least overhead in the previous sampling phase. The computation periodically resamples to adjust dynamically to changes in the environment. We have implemented dynamic feedback in the context of a parallelizing compiler for object-based programs. The generated code uses dynamic feedback to automatically choose the best synchronization optimization policy. Our experimental results show that the synchronization optimization policy has a significant impact on the overall performance of the computation, that the best policy varies from program to program, that the compiler is unable to statically choose the best policy, and that dynamic feedback enables the generated code to exhibit performance comparable to that of code that has been manually tuned to use the best policy. We have also performed a theoretical analysis that provides, under certain assumptions, a guaranteed optimality bound for dynamic feedback relative to a hypothetical (and unrealizable) optimal algorithm that uses the best policy at every point during the execution.
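The alternation of sampling and production phases described above can be illustrated with a minimal Python sketch. It is not the compiler-generated code from the article: the function `dynamic_feedback`, the policy names `coarse` and `fine`, and the parameters `sample_len` and `production_len` are all hypothetical, phases are measured in work items rather than in time, and each version simply returns a simulated overhead value instead of being timed.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model: a "version" processes one work item and
# returns the overhead it incurred (simulated, not measured).
Version = Callable[[int], float]


def dynamic_feedback(
    versions: Dict[str, Version],
    work: List[int],
    sample_len: int = 2,
    production_len: int = 8,
) -> List[Tuple[str, str]]:
    """Process `work`, alternating sampling and production phases.

    Returns one (phase, version_name) entry per processed work item.
    """
    log: List[Tuple[str, str]] = []
    i = 0
    while i < len(work):
        # Sampling phase: run every version briefly on real work
        # and record its average overhead per item.
        overhead: Dict[str, float] = {}
        for name, run in versions.items():
            chunk = work[i:i + sample_len]
            if not chunk:
                break
            overhead[name] = sum(run(item) for item in chunk) / len(chunk)
            log.extend(("sample", name) for _ in chunk)
            i += len(chunk)
        if not overhead:
            break

        # Production phase: use the version with the least overhead
        # observed in the sampling phase that just finished.
        best = min(overhead, key=lambda name: overhead[name])
        chunk = work[i:i + production_len]
        for item in chunk:
            versions[best](item)
        log.extend(("production", best) for _ in chunk)
        i += len(chunk)
    return log


# Simulated environment change: the "coarse" policy is cheap for
# early items and becomes expensive from item 20 onwards.
def coarse(item: int) -> float:
    return 1.0 if item < 20 else 5.0


def fine(item: int) -> float:
    return 3.0


log = dynamic_feedback({"coarse": coarse, "fine": fine}, list(range(40)))
```

In this simulated run the first two production phases select `coarse`. Once the environment changes, the next sampling phase observes the higher overhead and the following production phase switches to `fine`. The second production phase keeps using `coarse` for a few items after the change, which shows the trade-off in choosing phase lengths: shorter production phases react sooner but spend more of the run executing the non-best versions during sampling.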