Parallelism is one of the main sources of performance improvement in modern computing environments, but exploiting the available parallelism efficiently depends on a number of parameters. Determining the optimum number of threads for a given data-parallel loop, for example, is a difficult problem whose answer depends on the specific parallel platform. This paper presents a learning-based, cost-aware approach to parallel workload allocation. The approach uses static program features to classify programs, then decides the best workload allocation scheme based on its prior experience with similar programs. Experimental results on 12 Java benchmarks (76 test cases in total, with different workloads) show that it can efficiently allocate the parallel workload among Java threads and achieves an efficiency of 86% on average.
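The idea of choosing an allocation from prior experience with similar programs can be illustrated with a small Java sketch. This is not the paper's method: the abstract does not say which classifier or which features are used, so the sketch substitutes a plain nearest-neighbour lookup. The class `ThreadCountAdvisor`, the `Sample` type, and the two example features (iteration count and operations per iteration) are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: recommend a thread count for a loop by copying the
 * choice recorded for the most similar previously seen loop.
 * Nearest-neighbour lookup is an assumption, not the paper's classifier.
 */
final class ThreadCountAdvisor {

    /** One past observation: static features of a loop and the thread count that worked best. */
    static final class Sample {
        final double[] features;
        final int bestThreads;

        Sample(double[] features, int bestThreads) {
            this.features = features.clone();
            this.bestThreads = bestThreads;
        }
    }

    private final List<Sample> training;

    ThreadCountAdvisor(List<Sample> training) {
        if (training.isEmpty()) {
            throw new IllegalArgumentException("need at least one training sample");
        }
        this.training = new ArrayList<>(training);
    }

    /** Squared Euclidean distance between two equal-length feature vectors. */
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("feature vectors differ in length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the thread count stored with the training sample closest to the given features. */
    int recommend(double[] features) {
        Sample nearest = training.get(0);
        double best = squaredDistance(nearest.features, features);
        for (Sample s : training) {
            double d = squaredDistance(s.features, features);
            if (d < best) {
                best = d;
                nearest = s;
            }
        }
        return nearest.bestThreads;
    }
}
```

In practice the features would need to be normalised so that one large-valued feature does not dominate the distance, and a cost-aware scheme would also weigh the overhead of creating threads; both are left out of this sketch.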