In this study, a global optimization meta-heuristic is developed for determining the optimal data distribution and degree of parallelism when parallelizing a sequential program for distributed-memory machines. The parallel program is modeled as a sequence of consecutive stages, and the method optimizes all stages of the entire program jointly rather than solving each stage in isolation. The meta-heuristic developed for this problem combines simulated annealing and hill climbing (SA-HC) in the search for the optimal configuration. Performance is measured by the total execution time of the program, including both communication and computation time. Two example codes from the literature, the first computation-intensive and the second communication-intensive, are used in the experiments. The SA-HC algorithm yields satisfactory results on these illustrative examples.
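The combined search described above can be sketched in code. Everything below is an illustrative assumption rather than the authors' actual formulation: the stage representation (a distribution choice and a processor count per stage), the toy cost model (per-stage computation time inversely proportional to the processor count, plus a fixed redistribution penalty whenever consecutive stages use different distributions), the cooling schedule, and the names `sa_hc`, `hill_climb`, `neighbor`, and `example_cost` are all hypothetical. The sketch only shows the general shape of interleaving simulated annealing with greedy hill-climbing refinement over a whole-program configuration.

```python
import math
import random

PROC_CHOICES = (1, 2, 4, 8, 16)  # assumed candidate degrees of parallelism


def neighbor(config, num_dists):
    """Perturb one stage: change its data distribution or its processor count."""
    stage = random.randrange(len(config))
    dist, procs = config[stage]
    if random.random() < 0.5:
        dist = random.randrange(num_dists)
    else:
        procs = random.choice(PROC_CHOICES)
    new = list(config)
    new[stage] = (dist, procs)
    return new


def hill_climb(config, cost, num_dists, tries=20):
    """Greedy local refinement: accept only improving neighbors."""
    best, best_cost = config, cost(config)
    for _ in range(tries):
        cand = neighbor(best, num_dists)
        c = cost(cand)
        if c < best_cost:
            best, best_cost = cand, c
    return best, best_cost


def sa_hc(cost, num_stages, num_dists=3, t0=10.0, cooling=0.95,
          iters=500, seed=0):
    """Simulated annealing interleaved with hill-climbing refinement."""
    random.seed(seed)
    cur = [(random.randrange(num_dists), random.choice(PROC_CHOICES))
           for _ in range(num_stages)]
    cur_cost = cost(cur)
    best, best_cost = cur, cur_cost
    t = t0
    for i in range(iters):
        cand = neighbor(cur, num_dists)
        c = cost(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if c < cur_cost or random.random() < math.exp((cur_cost - c) / t):
            cur, cur_cost = cand, c
            if cur_cost < best_cost:
                best, best_cost = cur, cur_cost
        if i % 50 == 49:  # periodic greedy refinement phase
            cur, cur_cost = hill_climb(cur, cost, num_dists)
            if cur_cost < best_cost:
                best, best_cost = cur, cur_cost
        t *= cooling  # geometric cooling schedule (assumed)
    return best, best_cost


def example_cost(config, work=100.0, comm=25.0):
    """Toy objective: computation time per stage plus redistribution cost."""
    total, prev_dist = 0.0, None
    for dist, procs in config:
        total += work / procs  # computation shrinks with more processors
        if prev_dist is not None and dist != prev_dist:
            total += comm      # penalty for changing distribution mid-program
        prev_dist = dist
    return total


if __name__ == "__main__":
    best, best_cost = sa_hc(example_cost, num_stages=4)
    print(best, best_cost)
```

With a fixed seed the run is deterministic; in this toy model the best attainable cost is 25.0 (every stage on 16 processors with a single shared distribution), and the search can only approach that bound from above.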