The TigerSHARC DSP Architecture

Authors:
Jose Fridman;Zvi Greenfield
Venue:
IEEE Micro
Year:
2000

Citing 6
Cited 39

Introduction to parallel computing: design and analysis of algorithms
Intel MMX for multimedia PCs

Communications of the ACM
Computer architecture (2nd ed.): a quantitative approach
Parallel Processing: From Applications to Systems
DSP Processors Hit the Mainstream

Computer
A new parallel DSP with short-vector memory architecture

ICASSP '99 Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 04

Modulo scheduling for a fully-distributed clustered VLIW architecture

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
An interleaved cache clustered VLIW processor

ICS '02 Proceedings of the 16th international conference on Supercomputing
Graph-partitioning based instruction scheduling for clustered processors

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Modulo scheduling with integrated register spilling for clustered VLIW architectures

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Exploiting Pseudo-Schedules to Guide Data Dependence Graph Partitioning

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
A Java-Enabled DSP

Embedded Processor Design Challenges: Systems, Architectures, Modeling, and Simulation - SAMOS
A Register File Architecture and Compilation Scheme for Clustered ILP Processors

Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
A Java-enabled DSP

Embedded processor design challenges
Effective instruction scheduling techniques for an interleaved cache clustered VLIW processor

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Local scheduling techniques for memory coherence in a clustered VLIW processor with a distributed data cache

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Interface Design Techniques for Single-Chip Systems

VLSID '03 Proceedings of the 16th International Conference on VLSI Design
Bottlenecks in Multimedia Processing with SIMD Style Extensions and Architectural Enhancements

IEEE Transactions on Computers
Instruction Replication for Clustered Microarchitectures

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Flexible Compiler-Managed L0 Buffers for Clustered VLIW Processors

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Integrated temporal and spatial scheduling for extended operand clustered VLIW processors

Proceedings of the 1st conference on Computing frontiers
Extended Split-Issue: Enabling Flexibility in the Hardware Implementation of NUAL VLIW DSPs

Proceedings of the 31st annual international symposium on Computer architecture
Efficient orchestration of sub-word parallelism in media processors

Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Removing communications in clustered microarchitectures through instruction replication

ACM Transactions on Architecture and Code Optimization (TACO)
Demystifying on-the-fly spill code

Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Distributed Data Cache Designs for Clustered VLIW Processors

IEEE Transactions on Computers
Future wireless convergence platforms

CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Exploiting Vector Parallelism in Software Pipelined Loops

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Software and hardware techniques to optimize register file utilization in VLIW architectures

International Journal of Parallel Programming
A Low-Power Multithreaded Processor for Software Defined Radio

Journal of VLSI Signal Processing Systems
Virtual Cluster Scheduling Through the Scheduling Graph

Proceedings of the International Symposium on Code Generation and Optimization
Heterogeneous Clustered VLIW Microarchitectures

Proceedings of the International Symposium on Code Generation and Optimization
Vector processing as an enabler for software-defined radio in handheld devices

EURASIP Journal on Applied Signal Processing
An integrated ARM and multi-core DSP simulator

CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
INTACTE: an interconnect area, delay, and energy estimation tool for microarchitectural explorations

CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
High-performance and low-power VLIW cores for numerical computations

International Journal of High Performance Computing and Networking
Configurable data memory for multimedia processing

Journal of Signal Processing Systems - Special Issue: Embedded computing systems for DSP
From SODA to scotch: The evolution of a wireless baseband processor

Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Trends in low power handset software defined radio

SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation
Compiler-assisted power optimization for clustered VLIW architectures

Parallel Computing
A low-power DSP for wireless communications

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Analyzing the Next Generation Software Defined Radio for Future Architectures

Journal of Signal Processing Systems
WCET-aware re-scheduling register allocation for real-time embedded systems with clustered VLIW architecture

Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems
Compiler-assisted energy optimization for clustered VLIW processors

Journal of Parallel and Distributed Computing
CAeSaR: unified cluster-assignment scheduling and communication reuse for clustered VLIW processors

Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems

Quantified Score

Hi-index	0.01

Abstract

This highly parallel DSP architecture based on a short-vector memory system incorporates techniques found in general-purpose computing. It promises sustained performance close to its peak computational rates of 900 Mflops (32-bit floating-point) or 3.6 BOPS (16-bit fixed-point).