VLIW architecture based DSPs have become widespread due to the combined benefits of simple hardware and compiler-extracted instruction-level parallelism. However, the VLIW instruction set architecture and its hardware implementation are tightly coupled, especially so for Non-Unit Assumed Latency (NUAL) VLIWs. The problem of object code compatibility across processors having different numbers of functional units or hardware latencies has been the Achilles' heel of this otherwise powerful architecture. In this paper, we propose eXtended Split-Issue (XSI), a novel mechanism that breaks the instruction packet syntax of an NUAL VLIW compiler without violating the dataflow dependences. XSI provides a designer the freedom of disassociating the hardware implementation of the NUAL VLIW processor from the instruction set architecture. Further, we investigate fairly radical (in the context of VLIW) changes to the hardware, such as removing an adder, adding a multiplier, and incorporating simultaneous multithreading (SMT), to show that our technique works for a variety of hardware configurations without compromising on performance. The technique can be used in both single-threaded and multi-threaded architectures to achieve a level of flexibility heretofore unavailable in the VLIW arena.