The processor allocation problem is addressed in the context of parallelizing a finite element modeling program on a processor mesh. A heuristic two-step, graph-based mapping scheme with polynomial-time complexity is developed: 1) initial generation of a graph partition for nearest-neighbor mapping of the finite element graph onto the processor graph, and 2) a heuristic boundary refinement procedure that incrementally alters the initial partition to improve load balancing among the processors. The effectiveness of the approach is gauged both by estimation, using a model with empirically determined parameters, and by implementation and experimental measurement on a 16-node hypercube parallel computer.
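The two-step idea can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: it assumes a one-dimensional chain of processors rather than a mesh, it reduces each finite element node to a single coordinate, and it measures load as a plain node count. The function names `initial_partition` and `refine_boundaries` are hypothetical. Step 1 assigns nodes to equal-width coordinate bands, so neighboring nodes land on the same or adjacent processors. Step 2 moves one boundary node at a time from a heavier processor to its lighter neighbor.

```python
def initial_partition(xs, num_procs):
    """Step 1 (assumed form): equal-width coordinate bands.

    Nodes that are close in space land on the same or adjacent
    processors, but a nonuniform mesh gives unequal loads.
    """
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / num_procs or 1.0  # guard against zero extent
    return [min(int((x - lo) / width), num_procs - 1) for x in xs]


def refine_boundaries(xs, owner, num_procs):
    """Step 2 (assumed form): incremental boundary refinement.

    Whenever two adjacent processors differ in load by two or more,
    one boundary node moves from the heavier to the lighter one.
    On a chain this is the same as shifting the cut between them
    by one position in coordinate order, so parts stay contiguous.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    loads = [owner.count(p) for p in range(num_procs)]

    moved = True
    while moved:
        moved = False
        for p in range(num_procs - 1):
            if loads[p] - loads[p + 1] >= 2:
                loads[p] -= 1
                loads[p + 1] += 1
                moved = True
            elif loads[p + 1] - loads[p] >= 2:
                loads[p + 1] -= 1
                loads[p] += 1
                moved = True

    # Rebuild the node-to-processor map from the refined loads.
    refined = [0] * len(xs)
    pos = 0
    for p, load in enumerate(loads):
        for i in order[pos:pos + load]:
            refined[i] = p
        pos += load
    return refined


if __name__ == "__main__":
    # A nonuniform 1-D "mesh": most nodes cluster near the origin.
    xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 4.0, 8.0]
    start = initial_partition(xs, 4)
    final = refine_boundaries(xs, start, 4)
    print("initial loads:", [start.count(p) for p in range(4)])
    print("refined loads:", [final.count(p) for p in range(4)])
```

Each move strictly lowers the sum of squared loads, so the loop terminates. On exit, adjacent processors differ by at most one node, although the first and last processor in the chain may still differ by more. A mesh version would need a real adjacency graph and a rule for choosing which boundary node to transfer, neither of which is specified in the abstract.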