The effect of instruction set complexity on program size and memory performance
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Guarded execution and branch prediction in dynamic ILP processors
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
A comparison of full and partial predicated execution support for ILP processors
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Enhancing loop buffering of media and telecommunications applications using low-overhead predication
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Value Prediction as a Cost-Effective Solution to Improve Embedded Processors Performance
VECPAR '00 Selected Papers and Invited Talks from the 4th International Conference on Vector and Parallel Processing
Design principles for synthesizable processor cores
ARCS'12 Proceedings of the 25th international conference on Architecture of Computing Systems
Hi-index | 0.00 |
Growing demand for high performance in embedded systems is creating new opportunities for Instruction-Level Parallelism (ILP) techniques that are traditionally used in high performance systems. Predicated execution, an important ILP technique, can be used to improve branch handling, reduce frequently mispredicted branches, and expose multiple execution paths to hardware resources. However, there is a major tradeoff in the design of the instruction set, the addition of a predicate operand for all instructions. We propose a new architecture framework for introducing predicated execution to embedded designs. Experimental results show a 10% performance improvement and a code reduction of 25% over a traditionally predicated architecture.