As FPGA-based systems that include soft processors become increasingly common, we are motivated to better understand how best to scale the performance of such systems. In this paper we explore the organization of processors and caches connected to a single off-chip memory channel, for workloads composed of many independent threads. In particular, we design and evaluate real FPGA-based processor, multithreaded processor, and multiprocessor systems on EEMBC benchmarks, investigating different approaches to scaling caches, processors, and thread contexts to maximize throughput while minimizing area. Our main finding is that while a single multithreaded processor offers improved performance over a single-threaded processor, multiprocessors composed of single-threaded processors scale better than those composed of multithreaded processors.