This paper describes micro-optimization, a technique for reducing the operation count and time required to perform floating-point calculations. Micro-optimization involves breaking floating-point operations into their constituent micro-operations and optimizing the resulting code. Exposing micro-operations to the compiler creates many opportunities for optimization. Redundant normalization operations can be eliminated or combined. Also, scheduling micro-operations separately allows dependent operations to be partially overlapped. A prototype expression compiler has been written to evaluate a number of micro-optimizations. On a set of benchmark expressions, operation count is reduced by 33% and execution time by 40%.
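The idea of eliminating redundant normalization can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper's prototype compiler: the `Toy` number format (an unbounded integer mantissa with a power-of-two exponent, so no rounding occurs), the three micro-operations `align`, `add_mant`, and `normalize`, and the `OpCounter` helper are all hypothetical. The sketch sums a chain of values twice, once normalizing after every addition and once normalizing only at the end, and counts the micro-operations issued in each case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Toy:
    """Hypothetical toy float: value = mant * 2**exp (exact, no rounding)."""
    mant: int
    exp: int


class OpCounter:
    """Counts micro-operations issued (illustrative helper)."""
    def __init__(self):
        self.ops = 0


def align(a, b, counter):
    """Micro-op: bring both operands to the smaller exponent."""
    counter.ops += 1
    e = min(a.exp, b.exp)
    return Toy(a.mant << (a.exp - e), e), Toy(b.mant << (b.exp - e), e)


def add_mant(a, b, counter):
    """Micro-op: add mantissas of already aligned operands."""
    counter.ops += 1
    assert a.exp == b.exp
    return Toy(a.mant + b.mant, a.exp)


def normalize(x, counter):
    """Micro-op: toy canonical form (odd mantissa, or zero with exp 0)."""
    counter.ops += 1
    if x.mant == 0:
        return Toy(0, 0)
    m, e = x.mant, x.exp
    while m % 2 == 0:
        m //= 2
        e += 1
    return Toy(m, e)


def sum_naive(xs, counter):
    """Each addition is a full operation: align, add, normalize."""
    acc = xs[0]
    for x in xs[1:]:
        a, b = align(acc, x, counter)
        acc = normalize(add_mant(a, b, counter), counter)
    return acc


def sum_deferred(xs, counter):
    """Intermediate normalizations are dropped; normalize once at the end."""
    acc = xs[0]
    for x in xs[1:]:
        a, b = align(acc, x, counter)
        acc = add_mant(a, b, counter)
    return normalize(acc, counter)


if __name__ == "__main__":
    xs = [Toy(3, 2), Toy(5, 0), Toy(1, 4), Toy(7, 1)]  # 12 + 5 + 16 + 14
    naive, deferred = OpCounter(), OpCounter()
    print(sum_naive(xs, naive), naive.ops)
    print(sum_deferred(xs, deferred), deferred.ops)
```

Because the toy format is exact, both versions produce the same result while the deferred version issues fewer micro-operations. A real floating-point format has a fixed-width mantissa and rounding, so deferring normalization there needs extra care to preserve results; the sketch does not model that, and its operation counts are not meant to reproduce the figures reported in the abstract.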