Nearest-neighbor mapping of finite element graphs onto processor meshes

Authors:
P. Sadayappan; F. Ercal
Affiliations:
The Ohio State Univ., Columbus, OH; The Ohio State Univ., Columbus, OH
Venue:
IEEE Transactions on Computers
Year:
1987

Abstract

The processor allocation problem is addressed in the context of parallelizing a finite element modeling program on a processor mesh. A heuristic two-step, graph-based mapping scheme with polynomial-time complexity is developed: 1) initial generation of a graph partition for nearest-neighbor mapping of the finite element graph onto the processor graph, and 2) a heuristic boundary refinement procedure that incrementally alters the initial partition to improve load balancing among the processors. The effectiveness of the approach is gauged both by estimation using a model with empirically determined parameters and by implementation and experimental measurement on a 16-node hypercube parallel computer.
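
The abstract does not spell out either heuristic, so the following is only a minimal sketch of the general two-step idea, not the authors' algorithm. It assumes a regular 4-neighbor grid as a stand-in for a finite element graph, uses simple geometric blocking for the initial partition, and uses a greedy boundary move rule for refinement. All function names (`grid_graph`, `initial_partition`, `refine`, and so on) are hypothetical.

```python
from collections import Counter


def grid_graph(width, height):
    """4-neighbor grid graph; an assumed stand-in for a finite element graph."""
    adj = {(x, y): [] for x in range(width) for y in range(height)}
    for (x, y) in adj:
        for dx, dy in ((1, 0), (0, 1)):
            nb = (x + dx, y + dy)
            if nb in adj:
                adj[(x, y)].append(nb)
                adj[nb].append((x, y))
    return adj


def mesh_distance(p, q):
    """Manhattan distance between two processors (row, col) of the mesh."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def initial_partition(adj, width, height, rows, cols):
    """Step 1 (assumed form): block nodes by position onto a rows x cols mesh."""
    return {(x, y): (y * rows // height, x * cols // width) for (x, y) in adj}


def is_nearest_neighbor(adj, part):
    """True if every graph edge joins the same or mesh-adjacent processors."""
    return all(mesh_distance(part[u], part[v]) <= 1 for u in adj for v in adj[u])


def refine(adj, part, max_passes=100):
    """Step 2 (assumed form): greedily move boundary nodes to lighter neighbors.

    A node moves from processor p to a neighboring node's processor q only if
    p holds at least two more nodes than q and the move keeps all of the
    node's edges within mesh distance 1.
    """
    part = dict(part)
    load = Counter(part.values())
    for _ in range(max_passes):
        moved = False
        for u in adj:
            p = part[u]
            for v in adj[u]:
                q = part[v]
                if q == p or load[p] - load[q] < 2:
                    continue
                if all(mesh_distance(q, part[w]) <= 1 for w in adj[u]):
                    part[u] = q
                    load[p] -= 1
                    load[q] += 1
                    moved = True
                    break
        if not moved:
            break
    return part


# Map a 5 x 4 grid of nodes onto a 2 x 2 processor mesh.
adj = grid_graph(5, 4)
before = initial_partition(adj, 5, 4, rows=2, cols=2)
after = refine(adj, before)
print(sorted(Counter(before.values()).values()))
print(sorted(Counter(after.values()).values()))
print(is_nearest_neighbor(adj, after))
```

In this toy case the blocked partition places 6, 4, 6, and 4 nodes on the four processors, and the refinement pass evens this out to 5 nodes each while keeping every graph edge between the same or adjacent processors. Each accepted move strictly reduces the sum of squared loads, so the loop terminates; `max_passes` is only a safeguard.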