Computer arithmetic algorithms
Computer arithmetic algorithms
Evaluation of A+B=K Conditions Without Carry Propagation
IEEE Transactions on Computers
Parallel reduced area multipliers
Journal of VLSI Signal Processing Systems - Special issue on application-specific array processors
Exceeding the dataflow limit via value prediction
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Clock rate versus IPC: the end of the road for conventional microarchitectures
Proceedings of the 27th annual international symposium on Computer architecture
Imprecise data computation for high performance asynchronous processors
Proceedings of the 2001 Asia and South Pacific Design Automation Conference
Synthesis and Optimization of Digital Circuits
Synthesis and Optimization of Digital Circuits
ICCD '96 Proceedings of the 1996 International Conference on Computer Design, VLSI in Computers and Processors
A Result Forwarding Mechanism for Asynchronous Pipelined Systems
ASYNC '97 Proceedings of the 3rd International Symposium on Advanced Research in Asynchronous Circuits and Systems
Speculative Completion for the Design of High-Performance Asynchronous Dynamic Adders
ASYNC '97 Proceedings of the 3rd International Symposium on Advanced Research in Asynchronous Circuits and Systems
ASYNC '01 Proceedings of the 7th International Symposium on Asynchronous Circuits and Systems
The Half-Adder Form and Early Branch Condition Resolution
ARITH '97 Proceedings of the 13th Symposium on Computer Arithmetic (ARITH '97)
Circuit optimization using carry-save-adder cells
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Hi-index | 0.02 |
Instruction level parallelism (ILP) is strictly limited by various dependencies. In particular, data dependency is a major performance bottleneck of data intensive applications. In this paper we address acceleration of the execution of instruction codes serialized by data dependencies. We propose a new computer architecture supporting a redundant number computation at the instruction level. To design and implement the scheme, an extended data-path and additional instructions are also proposed. The architectural exploitation of instruction level redundant number computations (IL-RNC) makes it possible to eliminate carry propagations. As a result execution of instructions which are serialized due to inherent data dependencies is accelerated. Simulations have been performed with data intensive processing benchmarks and the proposed architecture shows about a 1.2-1.35 fold speedup over a conventional counterpart. The proposed architecture model can be used effectively for data intensive processing in a microprocessor, a digital signal processor and a multimedia processor.