Performance Analysis of Parallel Processing Systems
IEEE Transactions on Software Engineering
Message-passing, network-based multicomputer systems are emerging as an economical alternative to supercomputers. Despite enormous effort to evaluate the performance of such systems and to determine an optimum scheduling algorithm (a problem known to be NP-complete), we still lack a complete and accurate performance model for analyzing distributed computing systems. A model is complete if all system parameters, network parameters, communication-overhead parameters, and application parameters appear explicitly in the solution. A good performance model, like a good scientific theory, should explain all normal behavior, predict any abnormality in the system, and allow the designer to adjust individual parameters while abstracting away unimportant details. In this paper, we develop such a complete and accurate performance model, which predicts the minimum finish time and, equivalently, the maximum speedup. In addition, we derive a closed-form solution that forecasts the optimum share of a parallel job (task) to assign to each processor (node). Task assignment can then be carried out in a distributed manner, which reinforces the distributed nature of the system and thus improves its performance. Most importantly, our analytical solution provides a mechanism for selecting, based on system and application parameters, the optimum number of processors (nodes) to assign to a given parallel job. The model helps the designer study the effect of each individual parameter on overall system performance, and thereby serves as a tool for managing the limited resources of a multicomputer system optimally, paying attention only to the parameters that are most critical.
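To illustrate the kind of trade-off such a model captures, the following is a minimal sketch, not the paper's actual closed-form solution: it assumes a toy cost model in which a job of W work units is split evenly over p nodes, each work unit costs t_c seconds of computation, and a per-node communication/startup overhead o is paid serially. The finish time is then roughly T(p) = (W/p)·t_c + p·o, which is minimized near p* = sqrt(W·t_c / o), giving an "optimum number of processors" in the spirit of the abstract. All parameter names and values here are hypothetical.

```python
def finish_time(p: int, W: float, t_c: float, o: float) -> float:
    """Predicted finish time with p nodes under the toy model:
    (W/p)*t_c of parallel computation plus p*o of serialized
    per-node communication overhead."""
    return (W / p) * t_c + p * o

def optimum_nodes(W: float, t_c: float, o: float, p_max: int) -> int:
    """Node count in [1, p_max] that minimizes the predicted finish time."""
    return min(range(1, p_max + 1), key=lambda p: finish_time(p, W, t_c, o))

# Hypothetical workload and cost parameters (not from the paper).
W, t_c, o = 10_000.0, 1e-3, 0.01
p_star = optimum_nodes(W, t_c, o, p_max=128)
speedup = finish_time(1, W, t_c, o) / finish_time(p_star, W, t_c, o)
```

Note the behavior the abstract argues for: adding nodes beyond p* makes the job slower, because the per-node communication overhead grows linearly while the computation term shrinks, so the model rather than intuition should pick the degree of parallelism.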