An important issue in the parallel execution of loops is how to partition and schedule the loops onto the available processors. While most existing dynamic scheduling algorithms handle load imbalance well, they fail to take locality into account and therefore perform poorly on parallel systems with non-uniform memory access times.
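The trade-off described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the scheduling algorithm of any particular system: the block data layout (`home_of`), the unit work cost, and the `remote_penalty` value are all hypothetical choices made for illustration. It compares central-queue self-scheduling, where the next idle processor takes the next chunk, against a static block partition in which every iteration runs where its data lives.

```python
import heapq


def home_of(i, n_iters, n_procs):
    """Processor whose local memory holds iteration i's data (assumed block layout)."""
    return i * n_procs // n_iters


def self_schedule(n_iters, n_procs, chunk=1, work=1.0, remote_penalty=3.0):
    """Central-queue self-scheduling: the earliest-idle processor takes the next chunk.

    Returns (number of remote iterations, finish time of the whole loop).
    """
    # Min-heap of (time at which the processor becomes free, processor id).
    idle = [(0.0, p) for p in range(n_procs)]
    heapq.heapify(idle)
    remote = 0
    finish = 0.0
    for start in range(0, n_iters, chunk):
        t, p = heapq.heappop(idle)
        for i in range(start, min(start + chunk, n_iters)):
            is_remote = home_of(i, n_iters, n_procs) != p
            remote += is_remote
            # Hypothetical cost model: remote data adds a fixed penalty.
            t += work + (remote_penalty if is_remote else 0.0)
        finish = max(finish, t)
        heapq.heappush(idle, (t, p))
    return remote, finish


def static_block(n_iters, n_procs, work=1.0):
    """Static block partition: every iteration runs on its home processor."""
    counts = [0] * n_procs
    for i in range(n_iters):
        counts[home_of(i, n_iters, n_procs)] += 1
    return 0, max(counts) * work


if __name__ == "__main__":
    print("self-scheduling (remote, finish):", self_schedule(16, 4))
    print("static block    (remote, finish):", static_block(16, 4))
```

With uniform iteration costs, the static partition wins in this model because it never touches remote memory. Self-scheduling only pays off when iteration costs vary enough to outweigh the remote-access penalty, which is the tension between load balance and locality that the text points to.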