Reuse methodology manual: for system-on-a-chip designs
Reuse methodology manual: for system-on-a-chip designs
Discrete-time signal processing (2nd ed.)
Discrete-time signal processing (2nd ed.)
Lx: a technology platform for customizable VLIW embedded processing
Proceedings of the 27th annual international symposium on Computer architecture
Heads and tails: a variable-length instruction format supporting parallel fetch and decode
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
DSP Processor Fundamentals: Architectures and Features
DSP Processor Fundamentals: Architectures and Features
Computer architecture: a quantitative approach
Computer architecture: a quantitative approach
JPEG Still Image Data Compression Standard
JPEG Still Image Data Compression Standard
Inter-Cluster Communication Models for Clustered VLIW Processors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Writing Testbenches: Functional Verification of HDL Models, Second Edition
Writing Testbenches: Functional Verification of HDL Models, Second Edition
A unified processor architecture for RISC & VLIW DSP
GLSVLSI '05 Proceedings of the 15th ACM Great Lakes symposium on VLSI
Performance evaluation of ring-structure register file in multimedia applications
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 2
Register file partitioning and recompilation for register file power reduction
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Register file partitioning and compiler support for reducing embedded processor power consumption
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Journal of Signal Processing Systems
Compiler supports for VLIW DSP processors with SIMD intrinsics
Concurrency and Computation: Practice & Experience
Hi-index | 0.00 |
This paper presents the design and implementation of a novel VLIW digital signal processor (DSP) for multimedia applications. The DSP core embodies a distributed & ping-pong register file, which saves 76.8% silicon area and improves 46.9% access time of centralized ones found in most VLIW processors by restricting its access patterns. However, it still has comparable performance (estimated in cycles) with state-of-the-art DSP for multimedia applications. A hierarchical instruction encoding scheme is also adopted to reduce the program sizes to 24.1~26.0%. The DSP has been fabricated in the UMC 0.13 μm 1P8M Copper Logic Process, and it can operate at 333 MHz while consuming 189 mW power. The core size is 3.2驴脳驴3.15 mm2 including 160 KB on-chip SRAM.