Through timing experiments on the IBM SP2, a method for quantifying overhead is developed to evaluate collective communication performance on any message-passing multicomputer. MPP designers and users can apply this method to reveal architectural bottlenecks and to trade off computation against communication when optimizing parallel applications.