In this work, we discuss several drawbacks of the conventional wide-width redundant operation table, chiefly wasted area and power consumption. We find that this waste is caused by storing the meaningless bits of narrow-width operand values. Based on this analysis, we propose a way to avoid storing this meaningless information. The proposed method, the partial resolution method, divides the conventional wide-width redundant operation table into two tables: a wide-width table with half of the entries and a narrow-width table with the other half. The wide-width and narrow-width tables store dynamic instructions whose operand values are wide and narrow, respectively. Because the narrow-width table stores fewer bits per entry, it requires less area and consumes less power than the wide-width table. Compared with a conventional wide-width redundant operation table with 2K entries, the partial resolution method decreases area by about 7% and 20% for the integer and floating-point tables, respectively, and reduces dynamic power consumption by about 34% and 30% for the integer and floating-point tables, respectively. Meanwhile, performance simulation with a high-end microarchitecture model and the SPEC2000 benchmarks shows that the partial resolution method has very little effect on performance, and even slightly increases IPC (instructions per cycle).
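The table split described above can be illustrated with a minimal software sketch. This is not the authors' hardware design: the 16-bit narrow threshold, the direct-mapped indexing by instruction address, the rule that an instruction counts as narrow only when both operands are narrow, and all class and function names are assumptions made for illustration. The sketch also ignores result width and replacement policy.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

NARROW_BITS = 16      # assumed narrow-operand width (not stated in the abstract)
TOTAL_ENTRIES = 2048  # 2K entries, split evenly between the two tables


def is_narrow(value: int, bits: int = NARROW_BITS) -> bool:
    """Return True if an unsigned operand value fits in `bits` bits."""
    return 0 <= value < (1 << bits)


@dataclass
class Entry:
    pc: int
    operands: Tuple[int, int]
    result: int


class ReuseTable:
    """Hypothetical direct-mapped table indexed by instruction address."""

    def __init__(self, entries: int) -> None:
        self.entries = entries
        self.slots: Dict[int, Entry] = {}

    def lookup(self, pc: int, a: int, b: int) -> Optional[int]:
        entry = self.slots.get(pc % self.entries)
        # A hit requires the same instruction and the same operand values.
        if entry is not None and entry.pc == pc and entry.operands == (a, b):
            return entry.result
        return None

    def insert(self, pc: int, a: int, b: int, result: int) -> None:
        # Direct-mapped: a new entry overwrites whatever shared the slot.
        self.slots[pc % self.entries] = Entry(pc, (a, b), result)


class PartialResolutionTable:
    """Half the entries go to a wide table, half to a narrow table."""

    def __init__(self, total_entries: int = TOTAL_ENTRIES) -> None:
        self.wide = ReuseTable(total_entries // 2)
        self.narrow = ReuseTable(total_entries // 2)

    def _select(self, a: int, b: int) -> ReuseTable:
        # Assumption: the narrow table is used only if both operands are narrow.
        return self.narrow if is_narrow(a) and is_narrow(b) else self.wide

    def lookup(self, pc: int, a: int, b: int) -> Optional[int]:
        return self._select(a, b).lookup(pc, a, b)

    def insert(self, pc: int, a: int, b: int, result: int) -> None:
        self._select(a, b).insert(pc, a, b, result)
```

In this model the saving comes from the narrow table: each of its entries would only need to hold `NARROW_BITS` bits per operand rather than the full operand width, which is the source of the area and power reduction the abstract reports for the hardware tables.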