MOVE: a framework for high-performance processor design
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Communications of the ACM
Vector vs. superscalar and VLIW architectures for embedded multimedia benchmarks
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Evaluating the Imagine Stream Architecture
Proceedings of the 31st annual international symposium on Computer architecture
The Vector-Thread Architecture
Proceedings of the 31st annual international symposium on Computer architecture
An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors
Proceedings of the 32nd annual international symposium on Computer Architecture
Bypass aware instruction scheduling for register file power reduction
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
SODA: A Low-power Architecture For Software Radio
Proceedings of the 33rd annual international symposium on Computer Architecture
Power Reduction in VLIW Processor with Compiler Driven Bypass Network
VLSID '07 Proceedings of the 20th International Conference on VLSI Design held jointly with 6th International Conference: Embedded Systems
Vector processing as an enabler for software-defined radio in handheld devices
EURASIP Journal on Applied Signal Processing
Larrabee: a many-core x86 architecture for visual computing
ACM SIGGRAPH 2008 papers
Computer
From SODA to scotch: The evolution of a wireless baseband processor
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
ASAP '08 Proceedings of the 2008 International Conference on Application-Specific Systems, Architectures and Processors
A customized cross-bar for data-shuffling in domain-specific simd processors
ARCS'07 Proceedings of the 20th international conference on Architecture of computing systems
The next generation challenge for software defined radio
SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation
Register File Power Reduction Using Bypass Sensitive Compiler
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Customizing wide-SIMD architectures for H.264
SAMOS'09 Proceedings of the 9th international conference on Systems, architectures, modeling and simulation
Understanding sources of inefficiency in general-purpose chips
Proceedings of the 37th annual international symposium on Computer architecture
Domain specific architecture for next generation wireless communication
Proceedings of the Conference on Design, Automation and Test in Europe
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Understanding sources of ineffciency in general-purpose chips
Communications of the ACM
An energy-efficient patchable accelerator for post-silicon engineering changes
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Instructions and hardware designs for accelerating SNOW 3G on a software-defined radio platform
Analog Integrated Circuits and Signal Processing
SIMD defragmenter: efficient ILP realization on data-parallel architectures
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Exploiting both pipelining and data parallelism with SIMD reconfigurable architecture
ARC'12 Proceedings of the 8th international conference on Reconfigurable Computing: architectures, tools and applications
Energy efficient special instruction support in an embedded processor with compact isa
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
Libra: Tailoring SIMD Execution Using Heterogeneous Hardware and Dynamic Configurability
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.02 |
In the past decade, the proliferation of mobile devices has increased at a spectacular rate. There are now more than 3.3 billion active cell phones in the world-a device that we now all depend on in our daily lives. The current generation of devices employs a combination of general-purpose processors, digital signal processors, and hardwired accelerators to provide giga-operations-per-second performance on milliWatt power budgets. Such heterogeneous organizations are inefficient to build and maintain, as well as waste silicon area and power. Looking forward to the next generation of mobile computing, computation requirements will increase by one to three orders of magnitude due to higher data rates, increased complexity algorithms, and greater computation diversity but the power requirements will be just as stringent. Scaling of existing approaches will not suffice instead the inherent computational efficiency, programmability, and adaptability of the hardware must change. To overcome these challenges, this paper proposes an example architecture, referred to as AnySP, for the next generation mobile signal processing. AnySP uses a co-design approach where the next generation wireless signal processing and high-definition video algorithms are analyzed to create a domain specific programmable architecture. At the heart of AnySP is a configurable single-instruction multiple-data datapath that is capable of processing wide vectors or multiple narrow vectors simultaneously. In addition, deeper computation subgraphs can be pipelined across the single-instruction multiple-data lanes. These three operating modes provide high throughput across varying application types. Results show that AnySP is capable of sustaining 4G wireless processing and high-definition video throughput rates, and will approach the 1000 Mops/mW efficiency barrier when scaled to 45nm.