Implementing Precise Interrupts in Pipelined Processors
IEEE Transactions on Computers
A variable instruction stream extension to the VLIW architecture
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
The expandable split window paradigm for exploiting fine-grain parallelsim
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Processor coupling: integrating compile time and runtime scheduling for parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Proceedings of the 28th annual international symposium on Microarchitecture
ARB: A Hardware Mechanism for Dynamic Reordering of Memory References
IEEE Transactions on Computers
Dynamically scheduled VLIW processors
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Threaded multiple path execution
Proceedings of the 25th annual international symposium on Computer architecture
A dynamic multithreading processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Treegion Scheduling for Wide Issue Processors
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Improving quasi-dynamic schedules through region slip
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Extended Split-Issue: Enabling Flexibility in the Hardware Implementation of NUAL VLIW DSPs
Proceedings of the 31st annual international symposium on Computer architecture
Virtual multiprocessor: an analyzable, high-performance architecture for real-time computing
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
High-Performance and Low-Cost Dual-Thread VLIW Processor Using Weld Architecture Paradigm
IEEE Transactions on Parallel and Distributed Systems
Distributed loop controller architecture for multi-threading in uni-threaded VLIW processors
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Multiple Instruction Stream Processor
Proceedings of the 33rd annual international symposium on Computer Architecture
International Journal of Parallel Programming
Dynamic instruction scheduling in a trace-based multi-threaded architecture
International Journal of Parallel Programming
PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Hi-index | 0.00 |
This paper presents a new architecture model, named Weld, for VLIW processors. Weld integrates multithreading support into a VLIW processor to hide run-time latency effects that cannot be determined by the compiler. It does this through a novel hardware technique called operation welding that merges operations from different threads to utilize the hardware resources more efficiently. Hardware contexts such as program counters and fetch units are duplicated to support multithreading. The experimental results show that the Weld architecture attains a maximum of 27% speedup as compared to a single-threaded VLIW architecture.