The TigerSHARC DSP Architecture
IEEE Micro
Bottlenecks in Multimedia Processing with SIMD Style Extensions and Architectural Enhancements
IEEE Transactions on Computers
A new look at exploiting data parallelism in embedded systems
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Design Methodology for a Tightly Coupled VLIW/Reconfigurable Matrix Architecture: A Case Study
Proceedings of the conference on Design, automation and test in Europe - Volume 2
SODA: A Low-power Architecture For Software Radio
Proceedings of the 33rd annual international symposium on Computer Architecture
The M5 Simulator: Modeling Networked Systems
IEEE Micro
Wireless Local-Area Network Fundamentals
Wireless Local-Area Network Fundamentals
Vector processing as an enabler for software-defined radio in handheld devices
EURASIP Journal on Applied Signal Processing
The Chameleon architecture for streaming DSP applications
EURASIP Journal on Embedded Systems
A baseband processor for software defined radio terminals
A baseband processor for software defined radio terminals
A software-defined communications baseband design
IEEE Communications Magazine
A 100 GOPS ASP based baseband processor for wireless communication
Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
This paper proposes a low-power high-throughput digital signal processor (DSP) for baseband processing in wireless terminals. It builds on our earlier architecture--Signal processing On Demand Architecture (SODA)--which is a four-processor, 32-lane SIMD machine that was optimized for WCDMA 2 Mbps and IEEE 802.11a. SODA has several shortcomings including large register file power, wasted cycles for data alignment, etc., and cannot satisfy the higher throughput and lower power requirements of emerging standards. We propose SODA-II, which addresses these problems by deploying the following schemes: operation chaining, pipelined execution of SIMD units, staggered memory access, and multicycling of computation units. Operation chaining involves chaining the primitive instructions, thereby eliminating unnecessary register file accesses and saving power. Pipelined execution of the vector instructions through the SIMD units improves the system throughput. Staggered execution of computation units helps simplify the data alignment networks. It is implemented in conjunction with multicycling so that the computation units are busy most of the time. The proposed architecture is evaluated with an in-house architecture emulator which uses component-level area and power models built with Synopsys and Artisan tools. Our results show that for WCDMA 2 Mbps, the proposed architecture uses two processors and consumes only 120 mW while SODA uses four processors and consumes 210 mW when implemented in 0.13- m technology and clocked at 300 MHz.