This report introduces the notion of trivial computation, where the appearance of simple operands reduces the complexity of a potentially difficult operation. An example of a trivial operation is integer divide-by-two; the division becomes a simple shift operation. Also discussed is the concept of redundant computation, where some operation repeatedly does the same function because it repeatedly sees the same operands. Using two separate benchmark suites, the SPEC benchmarks and the Perfect Club, and concentrating on multiplication, we find a surprising amount of trivial and redundant operation. Various architectural means of exploiting this knowledge to improve computational efficiency include detection of trivial operands, memoization, and the result cache.
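The two ideas can be illustrated with a small software sketch. It is not the report's hardware design: the names `trivial_multiply` and `ResultCache`, the particular set of operands treated as trivial, and the least-recently-used eviction policy are all assumptions chosen for illustration. The sketch first checks whether either operand makes a multiplication trivial, and otherwise looks up the operand pair in a small table of earlier results.

```python
from collections import OrderedDict


def trivial_multiply(a: int, b: int):
    """Return a * b when one operand makes it trivial, else None.

    The set of trivial operands (0, 1, -1, positive powers of two)
    is an illustrative assumption, not a list taken from the report.
    """
    for x, y in ((a, b), (b, a)):
        if x == 0:
            return 0
        if x == 1:
            return y
        if x == -1:
            return -y
        # Positive power of two: the product reduces to a left shift.
        if x > 0 and x & (x - 1) == 0:
            return y << (x.bit_length() - 1)
    return None


class ResultCache:
    """Hypothetical result cache: memoizes products keyed by operand pair."""

    def __init__(self, capacity: int = 64):
        self.capacity = capacity
        self.table = OrderedDict()  # (operand, operand) -> product
        self.trivial = self.hits = self.misses = 0

    def multiply(self, a: int, b: int) -> int:
        result = trivial_multiply(a, b)
        if result is not None:
            self.trivial += 1
            return result
        # Multiplication is commutative, so normalize the key order.
        key = (a, b) if a <= b else (b, a)
        if key in self.table:
            self.hits += 1
            self.table.move_to_end(key)  # mark as most recently used
            return self.table[key]
        self.misses += 1
        result = a * b  # the "expensive" operation being avoided
        self.table[key] = result
        if len(self.table) > self.capacity:
            self.table.popitem(last=False)  # evict least recently used
        return result


cache = ResultCache(capacity=2)
products = [cache.multiply(a, b) for a, b in [(3, 7), (7, 3), (8, 5), (0, 9), (3, 7)]]
print(products, cache.trivial, cache.hits, cache.misses)
```

In this run, `(8, 5)` and `(0, 9)` are handled as trivial operations, the second and third occurrences of the operand pair 3 and 7 are served from the table, and only the first `(3, 7)` performs a full multiplication. Counting the three categories separately mirrors the kind of measurement the report describes, though the counters here are only a toy stand-in for benchmark instrumentation.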