Modeling Communication Overhead: MPI and MPL Performance on the IBM SP2

Authors:
Zhiwei Xu; Kai Hwang
Venue:
IEEE Parallel & Distributed Technology: Systems & Technology
Year:
1996

Abstract

Timing experiments on the IBM SP2 yield an overhead-quantifying method for evaluating collective communication performance on any message-passing multicomputer. MPP designers and users can apply the method to reveal architectural bottlenecks and to trade off computation against communication when optimizing parallel applications.
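
The abstract does not spell out the method itself, so the following is only a minimal sketch of one common way to turn timing measurements into an overhead estimate. It assumes a linear model t(m) = t0 + m / r_inf, where t0 is a startup latency and r_inf an asymptotic bandwidth. That model, the function name fit_overhead, and the sample numbers are assumptions made for illustration and are not taken from the paper.

```python
def fit_overhead(sizes, times):
    """Least-squares fit of the assumed model t(m) = t0 + m / r_inf.

    sizes: message lengths in bytes
    times: measured communication times in microseconds
    Returns (t0, r_inf): startup latency in microseconds and
    asymptotic bandwidth in bytes per microsecond.
    """
    n = len(sizes)
    if n < 2 or n != len(times):
        raise ValueError("need at least two (size, time) pairs of equal length")

    mean_m = sum(sizes) / n
    mean_t = sum(times) / n

    # Ordinary least squares for a straight line: time = t0 + slope * size.
    sxx = sum((m - mean_m) ** 2 for m in sizes)
    sxy = sum((m - mean_m) * (t - mean_t) for m, t in zip(sizes, times))
    if sxx == 0:
        raise ValueError("message sizes must not all be identical")

    slope = sxy / sxx
    if slope <= 0:
        raise ValueError("fitted per-byte cost is not positive")

    t0 = mean_t - slope * mean_m
    return t0, 1.0 / slope


# Synthetic, noise-free data generated from the assumed model itself
# (t0 = 50 us, r_inf = 40 bytes/us). These are NOT measurements from any machine.
example_sizes = [1024, 4096, 16384, 65536]
example_times = [50.0 + m / 40.0 for m in example_sizes]
example_t0, example_r_inf = fit_overhead(example_sizes, example_times)
```

In practice the times would come from repeated runs of a point-to-point or collective operation at each message length. A fit like this gives one latency and bandwidth pair per operation and machine size, which is the kind of figure that can be compared across architectures.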