An Energy-Efficient Processor Architecture for Embedded Systems

Authors:
James Balfour;William Dally;David Black-Schaffer;Vishal Parikh;JongSoo Park
Affiliations:
Stanford University, Stanford;Stanford University, Stanford;Stanford University, Stanford;Stanford University, Stanford;Stanford University, Stanford
Venue:
IEEE Computer Architecture Letters
Year:
2008

Abstract

We present an efficient programmable architecture for compute-intensive embedded applications. The processor architecture uses instruction registers to reduce the cost of delivering instructions, and a hierarchical and distributed data register organization to deliver data. Instruction registers capture instruction reuse and locality in inexpensive storage structures that are located near to the functional units. The data register organization captures reuse and locality in different levels of the hierarchy to reduce the cost of delivering data. Exposed communication resources eliminate pipeline registers and control logic, and allow the compiler to schedule efficient instruction and data movement. The architecture keeps a significant fraction of instruction and data bandwidth local to the functional units, which reduces the cost of supplying instructions and data to large numbers of functional units. This architecture achieves an energy efficiency that is 23× greater than an embedded RISC processor.