AnySP: anytime anywhere anyway signal processing

Authors:
Mark Woh;Sangwon Seo;Scott Mahlke;Trevor Mudge;Chaitali Chakrabarti;Krisztian Flautner
Affiliations:
University of Michigan, Ann Arbor, USA;University of Michigan, Ann Arbor, USA;University of Michigan, Ann Arbor, USA;University of Michigan, Ann Arbor, USA;Arizona State University, Tempe, AZ, USA;ARM, Ltd.
Venue:
Proceedings of the 36th annual international symposium on Computer architecture
Year:
2009

Abstract

In the past decade, mobile devices have proliferated at a spectacular rate. There are now more than 3.3 billion active cell phones in the world, devices that we all now depend on in our daily lives. The current generation of devices employs a combination of general-purpose processors, digital signal processors, and hardwired accelerators to provide giga-operations-per-second performance on milliwatt power budgets. Such heterogeneous organizations are inefficient to build and maintain, and they waste silicon area and power. In the next generation of mobile computing, computation requirements will increase by one to three orders of magnitude due to higher data rates, more complex algorithms, and greater computation diversity, but the power requirements will be just as stringent. Scaling existing approaches will not suffice; instead, the inherent computational efficiency, programmability, and adaptability of the hardware must change. To overcome these challenges, this paper proposes an example architecture, referred to as AnySP, for next-generation mobile signal processing. AnySP uses a co-design approach in which next-generation wireless signal processing and high-definition video algorithms are analyzed to create a domain-specific programmable architecture. At the heart of AnySP is a configurable single-instruction multiple-data (SIMD) datapath that is capable of processing wide vectors or multiple narrow vectors simultaneously. In addition, deeper computation subgraphs can be pipelined across the SIMD lanes. These three operating modes provide high throughput across varying application types. Results show that AnySP is capable of sustaining 4G wireless processing and high-definition video throughput rates, and will approach the 1000 Mops/mW efficiency barrier when scaled to 45 nm.
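
The three operating modes named in the abstract (one wide vector, several narrow vectors at once, and a deep computation subgraph pipelined across lanes) can be illustrated with a small software model. This is only a sketch under stated assumptions and is not AnySP's instruction set or microarchitecture: the lane count `LANES`, the function names `wide_mode`, `narrow_mode`, and `pipelined_mode`, and the use of `None` as a pipeline bubble are all hypothetical choices made for illustration.

```python
import operator
from typing import Callable, List, Optional, Sequence

LANES = 8  # hypothetical lane count for illustration; not taken from the abstract


def wide_mode(a: Sequence[int], b: Sequence[int],
              op: Callable[[int, int], int]) -> List[int]:
    """One operation is applied across all lanes of a single wide vector."""
    assert len(a) == len(b) == LANES
    return [op(x, y) for x, y in zip(a, b)]


def narrow_mode(groups_a: Sequence[Sequence[int]],
                groups_b: Sequence[Sequence[int]],
                op: Callable[[int, int], int]) -> List[List[int]]:
    """The lanes are split into groups; each group handles its own narrow vector."""
    assert sum(len(g) for g in groups_a) <= LANES
    return [[op(x, y) for x, y in zip(ga, gb)]
            for ga, gb in zip(groups_a, groups_b)]


def pipelined_mode(inputs: Sequence[int],
                   stages: Sequence[Callable[[int], int]]) -> List[int]:
    """A chain of dependent operations is mapped onto successive lanes.

    Each lane runs one stage and forwards its result to the next lane on
    the following cycle. None marks an empty pipeline slot (assumption).
    """
    assert len(stages) <= LANES
    regs: List[Optional[int]] = [None] * len(stages)  # one output register per lane
    out: List[int] = []
    # Extra None items give the pipeline enough cycles to drain.
    stream: List[Optional[int]] = list(inputs) + [None] * len(stages)
    for item in stream:
        if regs[-1] is not None:
            out.append(regs[-1])  # the last lane retires a finished result
        # Update from the last lane backwards so each value moves one lane per cycle.
        for i in range(len(stages) - 1, 0, -1):
            prev = regs[i - 1]
            regs[i] = stages[i](prev) if prev is not None else None
        regs[0] = stages[0](item) if item is not None else None
    return out


if __name__ == "__main__":
    print(wide_mode(list(range(1, 9)), [1] * 8, operator.add))
    print(narrow_mode([[1, 2, 3, 4], [5, 6, 7, 8]],
                      [[1, 1, 1, 1], [2, 2, 2, 2]], operator.mul))
    print(pipelined_mode([1, 2, 3],
                         [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]))
```

In this toy model the same pool of lanes serves all three cases, which is the flexibility the abstract attributes to the configurable datapath; real hardware would additionally need lane interconnect and control that the sketch does not attempt to represent.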