Performance Analysis of Parallel Processing Systems
IEEE Transactions on Software Engineering
Message-passing, network-based multicomputer systems are emerging as an economical alternative to supercomputers. Despite enormous effort to evaluate the performance of such systems and to determine an optimum scheduling algorithm (a problem known to be NP-complete), we still lack a complete and accurate performance model for analyzing distributed computing systems. A model is complete if all system parameters, network parameters, communication-overhead parameters, and application parameters appear explicitly in the solution. A good performance model, like a good scientific theory, should explain all normal behavior, predict any abnormality in the system, and allow the designer to adjust individual parameters while abstracting away unimportant details. In this paper, we develop such a complete and accurate performance model, which predicts the minimum finish time and, equivalently, the maximum speedup. In addition, we derive a closed-form solution that forecasts the optimum share of a parallel job (task) to assign to each processor (node). Task assignment can then be carried out in a distributed manner, which reinforces the distributed nature of the system and thus improves its performance. Most importantly, our analytical solution provides a mechanism for selecting, based on system and application parameters, the optimum number of processors (nodes) to assign to a given parallel job. The model helps the designer study the effect of each individual parameter on overall system performance, and thereby serves as a tool for managing the limited resources of a multicomputer system optimally, paying attention only to the parameters that are most critical.
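To illustrate the kind of trade-off such a model captures, the following is a minimal sketch, not the paper's actual closed-form solution: it assumes a toy cost model in which a job of W work units is split evenly over p nodes, each work unit costs t_c seconds of computation, and a per-node communication/startup overhead o is paid serially. The finish time is then roughly T(p) = (W/p)·t_c + p·o, which is minimized near p* = sqrt(W·t_c / o), giving an "optimum number of processors" in the spirit of the abstract. All parameter names and values here are hypothetical.

```python
def finish_time(p: int, W: float, t_c: float, o: float) -> float:
    """Predicted finish time with p nodes under the toy model:
    (W/p)*t_c of parallel computation plus p*o of serialized
    per-node communication overhead."""
    return (W / p) * t_c + p * o

def optimum_nodes(W: float, t_c: float, o: float, p_max: int) -> int:
    """Node count in [1, p_max] that minimizes the predicted finish time."""
    return min(range(1, p_max + 1), key=lambda p: finish_time(p, W, t_c, o))

# Hypothetical workload and cost parameters (not from the paper).
W, t_c, o = 10_000.0, 1e-3, 0.01
p_star = optimum_nodes(W, t_c, o, p_max=128)
speedup = finish_time(1, W, t_c, o) / finish_time(p_star, W, t_c, o)
```

Note the behavior the abstract argues for: adding nodes beyond p* makes the job slower, because the per-node communication overhead grows linearly while the computation term shrinks, so the model rather than intuition should pick the degree of parallelism.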