Switch chips are building blocks for computer and communication systems. Switches need internal buffering because of output contention; shared buffering is known to perform better than multiple input queues or buffers, and its VLSI implementation is no more expensive than theirs. We present a new organization for a shared buffer with its associated switching and cut-through functions. It is simpler and smaller than wide or interleaved organizations, and it is particularly suitable for VLSI technologies. It is based on multiple memory banks, addressed in a pipelined fashion: the first word of a packet is transferred to or from the first bank, followed by a "wave" of similar operations that transfers each remaining word to or from the next bank in turn. An FPGA-based prototype is operational, while standard-cell and full-custom chips are being submitted for fabrication. Simulation of the full-custom version indicates that, even in a conservative 1-micron CMOS technology, a 64 Kbit central buffer for an 8×8 switch operates at 1 Gbps per link (worst case) and fits in 45 mm², including crossbar and cut-through.
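The pipelined addressing described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the chip's actual design: the class `PipelinedBuffer`, the function `simulate`, the operation tuple format, and the sizes (4 banks, 2 packet slots) are hypothetical names and values chosen for illustration. The model assumes a packet has exactly as many words as there are banks, that word k of every packet lives in bank k at the packet's address, and that at most one operation enters the first bank per clock cycle.

```python
class PipelinedBuffer:
    """Behavioral sketch of a shared buffer built from pipelined memory banks.

    Assumption (not from the source): word k of a packet is stored in bank k,
    at the same address in every bank.
    """

    def __init__(self, num_banks, depth):
        self.banks = [[None] * depth for _ in range(num_banks)]
        # stage[k] is the operation that bank k executes in the current cycle.
        self.stage = [None] * num_banks

    def step(self, new_op=None):
        """Advance one clock cycle.

        new_op enters bank 0; every older operation moves one bank further,
        forming the "wave". An operation is (kind, addr, words, tag) with
        kind 'w' (write) or 'r' (read). Returns (tag, bank, word) per read.
        """
        self.stage = [new_op] + self.stage[:-1]
        out = []
        for k, op in enumerate(self.stage):
            if op is None:
                continue
            kind, addr, words, tag = op
            if kind == 'w':
                self.banks[k][addr] = words[k]   # bank k stores word k
            else:
                out.append((tag, k, self.banks[k][addr]))
        return out


def simulate(ops_by_cycle, num_banks, depth, cycles):
    """Run the buffer for a number of cycles and collect read words per tag."""
    buf = PipelinedBuffer(num_banks, depth)
    received = {}
    for t in range(cycles):
        for tag, _bank, word in buf.step(ops_by_cycle.get(t)):
            received.setdefault(tag, []).append(word)
    return received


# One operation is initiated per cycle: two writes, then two reads.
ops = {
    0: ('w', 0, ['a0', 'a1', 'a2', 'a3'], 'A'),
    1: ('w', 1, ['b0', 'b1', 'b2', 'b3'], 'B'),
    2: ('r', 0, None, 'A'),   # starts while the write of A is still in flight
    3: ('r', 1, None, 'B'),
}
result = simulate(ops, num_banks=4, depth=2, cycles=7)
print(result)
```

In this model the read of packet A begins at cycle 2, before its write wave has reached the last bank, yet every word is read after it was written because the read wave trails the write wave by a fixed number of banks. That is the sense in which a pipelined organization lends itself to cut-through, under the assumptions listed above.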