We present a new VLIW core as a successor to the TriMedia TM1000. The processor targets embedded use in media-processing devices such as DTVs and set-top boxes. Because it is intended as a core, it must be supplemented with on-chip co-processors to obtain a cost-effective system. Good performance is obtained through a uniform 64-bit VLIW design with five issue slots, which supports sub-word parallelism through an extensive instruction set optimized for media processing. Multi-slot 'super-ops' allow powerful multi-argument and multi-result operations. As an example, an IDCT algorithm shows a very low instruction count in comparison with other processors. To achieve good performance, critical sections of the application source code need to be rewritten using vector data types and function calls for media operations. Benchmarking with several media applications was used to tune the instruction set and to study cache behavior. The result is a VLIW architecture with wide data paths and relatively simple CPU control.
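To make the idea of sub-word parallelism on a 64-bit data path concrete, the following is a minimal sketch that treats one 64-bit word as four independent 16-bit lanes and adds all lanes at once with wrap-around. It is a software emulation for illustration only and does not reproduce any actual CPU64 operation; the names `pack`, `unpack`, and `packed_add16`, the choice of 16-bit lanes, and the wrap-around (non-saturating) behavior are all assumptions.

```python
# Illustrative emulation of a packed (sub-word) add on a 64-bit word.
# All names and the lane layout here are hypothetical, not CPU64 operations.

MASK64 = (1 << 64) - 1
LANE_BITS = 16
LANES = 4
HIGH = 0x8000_8000_8000_8000   # top bit of every 16-bit lane
LOW = ~HIGH & MASK64           # lower 15 bits of every lane


def pack(lanes):
    """Pack four 16-bit values into one 64-bit word (lane 0 in the low bits)."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & 0xFFFF) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & 0xFFFF for i in range(LANES)]


def packed_add16(a, b):
    """Add four 16-bit lanes in parallel, wrapping around within each lane."""
    # Adding only the low 15 bits of each lane cannot carry into the next lane.
    low_sum = (a & LOW) + (b & LOW)
    # The top bit of each lane is a_top XOR b_top XOR carry-in; the carry-in
    # is already sitting in the top bit of low_sum.
    return (low_sum ^ ((a ^ b) & HIGH)) & MASK64


if __name__ == "__main__":
    x = pack([1, 0xFFFF, 0x7FFF, 100])
    y = pack([2, 1, 1, 200])
    print(unpack(packed_add16(x, y)))
```

A single call to `packed_add16` replaces four scalar additions, which is the kind of saving that rewriting critical sections with vector data types is meant to expose. A real media instruction set would typically also offer saturating variants and other lane widths, which this sketch leaves out.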