Recently, high-performance IEEE 754 single-precision floating-point software has been designed to make the best use of some features (integer arithmetic, parallelism) of the STMicroelectronics ST200 Very Long Instruction Word (VLIW) processor. We review here the techniques and software tools used or developed for this design and its implementation, and how they made it possible to expose a very high degree of instruction-level parallelism (ILP). The key points include a hierarchical description of function evaluation algorithms, the exploitation of the standard encoding of floating-point data, the automatic generation of fast and accurate polynomial evaluation schemes, and some compiler optimizations.
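Two of the points above can be illustrated with a minimal sketch. It is not taken from the ST200 software; the function names are hypothetical and Python is used only for readability. The first function splits a binary32 value into its sign, biased exponent, and fraction fields using integer operations only, which is what the standard encoding makes possible. The other two evaluate the same degree-3 polynomial: Horner's rule forms one long dependency chain, while Estrin's scheme (a well-known alternative, chosen here as an assumption for illustration) computes independent sub-expressions that a VLIW core could issue in parallel.

```python
import struct

def decode_binary32(x):
    """Return (sign, biased exponent, fraction) of x rounded to IEEE 754 binary32."""
    # Reinterpret the 4 bytes of the single-precision value as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31                 # 1 bit
    biased_exp = (bits >> 23) & 0xFF  # 8 bits, bias 127
    fraction = bits & 0x7FFFFF        # 23 bits
    return sign, biased_exp, fraction

def horner_deg3(coeffs, x):
    """Evaluate a0 + a1*x + a2*x^2 + a3*x^3 serially (each step needs the previous one)."""
    a0, a1, a2, a3 = coeffs
    return ((a3 * x + a2) * x + a1) * x + a0

def estrin_deg3(coeffs, x):
    """Evaluate the same polynomial with Estrin's scheme (hypothetical illustration)."""
    a0, a1, a2, a3 = coeffs
    # These three lines do not depend on each other and could run in parallel.
    lo = a0 + a1 * x
    hi = a2 + a3 * x
    x2 = x * x
    return lo + hi * x2
```

Horner needs three dependent multiply-add steps here, whereas the Estrin form has a critical path of two steps at the cost of one extra multiplication. The two forms can round differently in floating-point arithmetic, which is why the abstract stresses schemes that are both fast and accurate.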