Adaptive cache coherency for detecting migratory shared data
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Optimizing dynamically-dispatched calls with run-time type feedback
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Dynamic feedback: an effective technique for adaptive computing
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Eliminating synchronization bottlenecks in object-based programs using adaptive replication
ICS '99 Proceedings of the 13th international conference on Supercomputing
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Symbiotic jobscheduling for a simultaneous multithreaded processor
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
High-level adaptive program optimization with ADAPT
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Parallel Programming with Polaris
Computer
Reducing Parallel Overheads Through Dynamic Serialization
IPPS '99/SPDP '99 Proceedings of the 13th International Symposium on Parallel Processing and the 10th Symposium on Parallel and Distributed Processing
Supporting Fine-Grained Synchronization on a Simultaneous Multithreading Processor
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Adaptive loop transformations for scientific programs
SPDP '95 Proceedings of the 7th IEEE Symposium on Parallel and Distributeed Processing
ADAPT: Automated De-Coupled Adaptive Program Transformation
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Improving the Effectiveness of Software Prefetching with Adaptive Execution
PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Compilation techniques for explicitly parallel programs
Compilation techniques for explicitly parallel programs
IBM Systems Journal
Adaptive execution techniques for SMT multiprocessor architectures
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
IBM Journal of Research and Development
Evaluation techniques for storage hierarchies
IBM Systems Journal
Adaptively increasing performance and scalability of automatically parallelized programs
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part I
Hi-index | 0.00 |
In simultaneous multithreading(SMT) multiprocessors, using all the available threads (logical processors) to run a parallel loop is not always beneficial due to the interference between threads and parallel execution overhead. To maximize the performance of a parallel loop on an SMT multiprocessor, it is important to find an appropriate number of threads for executing the parallel loop. This article presents adaptive execution techniques that find a proper execution mode for each parallel loop in a conventional loop-level parallel program on SMT multiprocessors. A compiler preprocessor generates code that, based on dynamic feedbacks, automatically determines at run time the optimal number of threads for each parallel loop in the parallel application. We evaluate our technique using a set of standard numerical applications and running them on a real SMT multiprocessor machine with 8 hardware contexts. Our approach is general enough to work well with other SMT multiprocessor or multicore systems.