In many parallel applications, each computation entity (process, thread, etc.) switches the bulk of its communication among a small group of other entities. We call this phenomenon switching locality. The Interconnection Cached Network (ICN) is a reconfigurable network especially suited to exploiting switching locality. It consists of many small, fast crossbars interconnected by a large, slow crossbar. The large crossbar is used for topology reconfiguration and the small crossbars for circuit switching. For a large class of communication patterns displaying switching locality (including meshes, tori, trees, rings, and pyramids), it is possible to choose ICN configurations and assignments of processes to processors such that every communication path passes through at most two switching components.

Much of the previous work on network performance analysis has assumed random, uniformly distributed communication and is therefore inapplicable to the many real-life parallel applications that lack this uniformity. We develop a methodology for analyzing the performance of synchronous, circuit-switched networks under different communication traffic patterns. We employ this methodology to compare the performance of the ICN with that of two more popular reconfigurable networks: the delta and the crossbar. We choose two communication patterns: a 2-D torus, representing a high degree of switching locality, and a fully connected graph, representing the complete absence of such locality. We show that in the presence of locality, the ICN comes very close to matching the crossbar's performance. This, together with the ICN's shorter network cycle period, makes the ICN more desirable. In the absence of switching locality, the reconfigurability of the ICN allows for a graceful degradation in performance.
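The mapping claim for a 2-D torus can be illustrated with a minimal sketch. It is not the paper's method: it assumes the torus is cut into rectangular tiles, that each tile is hosted by one small crossbar, and that the large crossbar, once configured, behaves like fixed wiring and is not counted as a switching component. The function names and the tiling scheme are hypothetical and chosen only for illustration.

```python
from itertools import product


def torus_edges(rows, cols):
    """Yield each neighbor pair of a rows x cols 2-D torus once.

    Assumes rows >= 3 and cols >= 3 so that wraparound does not
    produce duplicate edges.
    """
    for r, c in product(range(rows), range(cols)):
        yield (r, c), ((r + 1) % rows, c)  # vertical neighbor
        yield (r, c), (r, (c + 1) % cols)  # horizontal neighbor


def cluster_of(node, tile_rows, tile_cols, cols):
    """Index of the (hypothetical) small crossbar hosting a torus node."""
    r, c = node
    tiles_per_row = cols // tile_cols
    return (r // tile_rows) * tiles_per_row + (c // tile_cols)


def analyze(rows, cols, tile_rows, tile_cols):
    """Count local and remote torus edges under a rectangular tiling.

    Returns (local, remote, max_external_links), where
    max_external_links is the largest number of links any one small
    crossbar would need toward the large crossbar.
    """
    if rows % tile_rows or cols % tile_cols:
        raise ValueError("tile size must divide the torus dimensions")
    local = remote = 0
    external = {}
    for a, b in torus_edges(rows, cols):
        ca = cluster_of(a, tile_rows, tile_cols, cols)
        cb = cluster_of(b, tile_rows, tile_cols, cols)
        if ca == cb:
            local += 1  # path uses one small crossbar
        else:
            remote += 1  # path uses two small crossbars
            external[ca] = external.get(ca, 0) + 1
            external[cb] = external.get(cb, 0) + 1
    return local, remote, max(external.values(), default=0)


local, remote, max_links = analyze(8, 8, 2, 2)
```

Under these assumptions, a neighbor pair inside one tile is served by a single small crossbar and a pair spanning two tiles by exactly two, which is consistent with the bound of at most two switching components. The third return value indicates how many ports toward the large crossbar each small crossbar would need for the chosen tile size.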