Network Processing in Multi-core FPGAs with Integrated Cache-Network Interface
RECONFIG '10 Proceedings of the 2010 International Conference on Reconfigurable Computing and FPGAs
A multicore FPGA platform with cache-integrated network interfaces (NIs) has been developed. It is appropriate for scalable multicores and combines the best of two worlds: the flexibility of caches (using implicit communication) and the efficiency of scratchpad memories (using explicit communication). Furthermore, the proposed scheme provides virtualized user-level RDMA capabilities and special hardware primitives (counters, queues) for communication and synchronization among the cores. This paper presents how the proposed architecture can be used for network processing applications by means of the hardware synchronization mechanisms. Two representative network processing benchmarks are used: one for header processing and one for payload processing. The Multiple Reader Queue (MRQ) scheme is used for header processing, while the user-level RDMA scheme is used for payload processing, where bulk data must be transferred. These applications are mapped to and evaluated on an FPGA platform with up to 24 processors. The performance evaluation shows that the proposed scheme can offer low-latency communication and increased programming efficiency, while also offloading communication and synchronization work from the processor.
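The abstract does not spell out how the Multiple Reader Queue behaves, so the following is only a minimal software sketch under a stated assumption: that an MRQ pairs each enqueued item with exactly one pending read request, in FIFO order, so that several worker cores can share one stream of packet headers. The class name `MultipleReaderQueue`, its methods, and the core and header labels are hypothetical illustrations, not the platform's actual interface.

```python
from collections import deque


class MultipleReaderQueue:
    """Toy software model (assumed semantics, not the hardware design):
    pairs queued packets with queued read requests in FIFO order."""

    def __init__(self):
        self.packets = deque()   # items written by the producer, not yet claimed
        self.readers = deque()   # ids of workers waiting for an item
        self.delivered = []      # (reader_id, packet) pairs, in delivery order

    def write(self, packet):
        """Producer side: enqueue an item and deliver it if a reader is waiting."""
        self.packets.append(packet)
        self._match()

    def read_request(self, reader_id):
        """Consumer side: register interest; served as soon as an item is available."""
        self.readers.append(reader_id)
        self._match()

    def _match(self):
        # Each item goes to exactly one reader; neither side busy-waits.
        while self.packets and self.readers:
            self.delivered.append((self.readers.popleft(), self.packets.popleft()))


def demo():
    q = MultipleReaderQueue()
    q.read_request("core1")      # two idle workers post requests first
    q.read_request("core2")
    q.write("hdr0")              # headers arrive and are handed out in order
    q.write("hdr1")
    q.write("hdr2")              # no reader waiting yet, so this one is buffered
    q.read_request("core1")      # core1 finishes and asks for more work
    return q.delivered
```

Under this assumption, the queue acts as a simple load balancer for header processing: whichever core asks first receives the next header, and the matching happens in the queue rather than in software running on the cores.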