The accelerator store: A shared memory framework for accelerator-based systems

Authors:
Michael J. Lyons;Mark Hempstead;Gu-Yeon Wei;David Brooks
Affiliations:
Harvard University, Cambridge, MA;Drexel University;Harvard University, Cambridge, MA;Harvard University, Cambridge, MA
Venue:
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Year:
2012

Abstract

This paper presents the many-accelerator architecture, a design approach that combines the scalability of homogeneous multi-core architectures with the high-performance, power-efficient hardware accelerators of systems-on-chip. In preparation for systems containing tens or hundreds of accelerators, we characterize a diverse pool of accelerators and find that each contains a significant amount of SRAM (up to 90% of its area). We take advantage of this finding and introduce the accelerator store, a scalable architectural component that minimizes accelerator area by sharing its memories among accelerators. We evaluate the accelerator store for two applications and find a 30% reduction in system area in exchange for small overheads (2% in performance, 0% to 8% in energy). The paper also identifies new research directions enabled by the accelerator store and the many-accelerator architecture.