Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Value locality and load value prediction
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Exceeding the dataflow limit via value prediction
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
The predictability of data values
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Can program profiling support value prediction?
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Highly accurate data value prediction using hybrid predictors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Value speculation scheduling for high performance processors
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Global Context-Based Value Prediction
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
HLS: combining statistical and symbolic simulation to guide microprocessor designs
Proceedings of the 27th annual international symposium on Computer architecture
Efficient checker processor design
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Compiler controlled value prediction using branch predictor based confidence
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Modeling Value Speculation: An Optimal Edge Selection Problem
IEEE Transactions on Computers
Load Redundancy Removal through Instruction Reuse
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Hi-index | 0.00 |
The performance of VLIW architectures is dependent on the capability of the compiler to detect and exploit instruction-level parallelism during instruction scheduling. To exploit the detected parallelism, instructions are reordered to reduce the length of the code schedule and minimize the cycle count for execution. Code reordering is limited by the dependencies among instructions arising from both control flow and data flow. In this paper, we present the design of a VLIW architecture that uses value prediction to remove data dependencies and improve the instruction schedule. Our architecture consists of two execution engines, one for executing the original VLIW code, and the other for executing compensation code after a misprediction. Any code executed due to mispredictions is executed in parallel with the VLIW instructions. The instruction set and hardware of a traditional VLIW machine are modified accordingly to support this type of concurrent execution. The efficacy of the proposed architecture is demonstrated by implementing the prediction model in the Trimaran compiler infrastructure and studying the speedups that result due to the parallel execution of compensation code.