The performance evaluation, workload characterization, and trace-driven simulation of a hypercube multicomputer running realistic workloads are presented. Eleven representative parallel applications were selected as benchmarks. Software monitoring techniques were then used to collect execution traces. Based on the measurement results, both the computation and communication behavior of these parallel programs were investigated. The time interval distributions were modeled by statistical functions, which were verified against the empirical data using a nonlinear regression technique. The temporal and spatial localities of message destinations were also studied. A model for the temporal locality of message length was introduced and used to analyze the communication traces. A trace-driven simulation environment, which uses the communication patterns of the parallel programs as inputs, was developed to study the behavior of the communication hardware under real workloads. Simulation results on DMA and link utilizations are reported.
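As an illustration of what studying the temporal locality of message destinations can involve, the following minimal sketch computes LRU stack distances over a destination trace. The abstract does not state which locality metric was used, so the choice of LRU stack distance, the function names, and the sample trace are all assumptions made for illustration only.

```python
from collections import Counter


def lru_stack_distances(destinations):
    """Return the LRU stack distance of each destination in a message trace.

    A distance of 1 means the message goes to the same node as the previous
    message. None marks the first message sent to a given node.
    (Hypothetical helper; not taken from the paper.)
    """
    stack = []       # most recently used destination is at index 0
    distances = []
    for dest in destinations:
        if dest in stack:
            depth = stack.index(dest) + 1   # 1-based depth in the LRU stack
            stack.remove(dest)
            distances.append(depth)
        else:
            distances.append(None)
        stack.insert(0, dest)               # dest becomes most recently used
    return distances


def distance_histogram(distances):
    """Count how often each stack distance occurs, ignoring first uses."""
    return Counter(d for d in distances if d is not None)


# Made-up trace of destination node IDs, in send order.
trace = [3, 3, 5, 3, 7, 5, 5, 1]
distances = lru_stack_distances(trace)
hist = distance_histogram(distances)
print(distances)
print(sorted(hist.items()))
```

A histogram concentrated at small distances would indicate strong temporal locality, meaning a node tends to resend to destinations it used recently.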