Parallel processing uses a group of processors to solve large problems faster than is possible on a single processor. To accomplish this, the processors must communicate and coordinate with each other through some type of network. However, the only function that most networks support is message routing. Consequently, functions that involve data from a group of processors must be implemented on top of message routing. We propose treating the network switch as a function unit that can receive data from a group of processors, execute operations, and return the result(s) to the appropriate processors. This paper describes how each of the architectural resources typically found in a network switch can be better utilized as a centralized function unit. A proof-of-concept prototype called ClusterNet4EPP has been implemented to demonstrate the feasibility of this concept.
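
To make the idea of a switch acting as a function unit concrete, the following is a minimal software sketch, not a description of ClusterNet4EPP itself. The class name `FunctionUnitSwitch`, its `submit` method, and the choice of an all-reduce style operation are assumptions introduced purely for illustration; the abstract does not specify which operations the prototype supports or how results are delivered.

```python
from functools import reduce
import operator


class FunctionUnitSwitch:
    """Hypothetical model of a switch that computes on data from a group
    of processors instead of only routing messages between them."""

    def __init__(self, num_ports, op):
        self.num_ports = num_ports  # one attached processor per port (assumption)
        self.op = op                # binary operation applied across all operands
        self.pending = {}           # port -> operand received in the current round

    def submit(self, port, value):
        """Receive one operand from the processor attached to `port`.

        Returns None until every port has contributed; the final call of a
        round returns a dict mapping each port to the combined result.
        """
        if not 0 <= port < self.num_ports:
            raise ValueError(f"unknown port {port}")
        if port in self.pending:
            raise ValueError(f"port {port} already contributed this round")
        self.pending[port] = value
        if len(self.pending) < self.num_ports:
            return None
        # All operands are present: execute the operation inside the switch.
        ordered = (self.pending[p] for p in range(self.num_ports))
        result = reduce(self.op, ordered)
        self.pending = {}  # reset for the next round
        # Return the result to every participating processor.
        return {p: result for p in range(self.num_ports)}


if __name__ == "__main__":
    switch = FunctionUnitSwitch(num_ports=4, op=operator.add)
    out = None
    for port, value in enumerate([3, 1, 4, 1]):
        out = switch.submit(port, value)
    print(out)
```

In this sketch each processor sends one operand and receives one result, whereas building the same group operation on top of plain message routing would require the processors to exchange and combine partial values among themselves.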