Scalability versus execution time in scalable systems
Journal of Parallel and Distributed Computing
Parallel programming is elusive. The relative performance of different parallel implementations varies with machine architecture, system size, and problem size. How to compare different implementations over a wide range of machine architectures and problem sizes has not been well addressed because of its difficulty. Scalability has been proposed in recent years to reveal the scaling properties of parallel algorithms and machines. In this paper, the relation between scalability and execution time is carefully studied. The concepts of crossing point analysis and range comparison are introduced. Crossing point analysis finds the points at which the relative performance of parallel algorithms and machines switches from slower to faster. Range comparison compares performance over a wide range of ensemble and problem sizes via scalability and crossing point analysis. Three algorithms from scientific computing are implemented on an Intel Paragon and an IBM SP2 parallel computer. Experimental and theoretical results show how the combination of scalability, crossing point analysis, and range comparison provides a practical solution for scalable performance evaluation and prediction. While our tests were conducted on homogeneous parallel computers, the proposed methodology applies to heterogeneous and network computing as well.
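As a rough illustration of what a crossing point is, the sketch below searches for the smallest ensemble size at which one implementation overtakes another. The abstract gives no performance models, so `time_a`, `time_b`, their constants, and the brute-force search in `crossing_point` are all hypothetical stand-ins, not the paper's scalability-based method.

```python
import math


def time_a(n, p):
    """Hypothetical model: cheap computation, communication cost linear in p."""
    return 1.0 * n / p + 4.0 * p


def time_b(n, p):
    """Hypothetical model: costlier computation, communication cost ~ log2(p)."""
    return 2.0 * n / p + 4.0 * math.log2(p)


def crossing_point(n, max_p):
    """Return the smallest p in [1, max_p] at which implementation B is
    strictly faster than implementation A for problem size n, or None
    if A remains at least as fast over the whole range."""
    for p in range(1, max_p + 1):
        if time_b(n, p) < time_a(n, p):
            return p
    return None


if __name__ == "__main__":
    # Scan a range of (assumed) problem sizes to see how the crossing point moves.
    for n in (256, 1024, 4096):
        print(n, crossing_point(n, max_p=128))
```

Under these assumed models, implementation A is faster on small ensembles and B on large ones. Repeating the search over a range of problem sizes is one simple way to picture the kind of range comparison the abstract describes.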