Software pipelining: an effective scheduling technique for VLIW machines
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
What every computer scientist should know about floating-point arithmetic
ACM Computing Surveys (CSUR)
Sentinel scheduling: a model for compiler-controlled speculative execution
ACM Transactions on Computer Systems (TOCS)
The SPARC architecture manual (version 9)
The SPARC architecture manual (version 9)
Dynamic memory disambiguation using the memory conflict buffer
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Unrolling-based optimizations for modulo scheduling
Proceedings of the 28th annual international symposium on Microarchitecture
PA-RISC 2.0 architecture
Exceeding the dataflow limit via value prediction
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Can program profiling support value prediction?
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Value speculation scheduling for high performance processors
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Design, implementation, and evaluation of optimizations in a just-in-time compiler
JAVA '99 Proceedings of the ACM 1999 conference on Java Grande
DIVA: a reliable substrate for deep submicron microarchitecture design
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
From flop to megaflops: Java for technical computing
ACM Transactions on Programming Languages and Systems (TOPLAS)
The Art of Computer Programming Volumes 1-3 Boxed Set
The Art of Computer Programming Volumes 1-3 Boxed Set
The Java Class Libraries Volume 1: java.io, java.lang, java.math, ,java.net, java.text,,java.util
The Java Class Libraries Volume 1: java.io, java.lang, java.math, ,java.net, java.text,,java.util
Java Virtual Machine Specification
Java Virtual Machine Specification
Java Language Specification, Second Edition: The Java Series
Java Language Specification, Second Edition: The Java Series
Optimizing Precision Overhead for x86 Processors
Proceedings of the 2nd Java Virtual Machine Research and Technology Symposium
Stride prefetching by dynamically inspecting objects
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
A region-based compilation technique for a Java just-in-time compiler
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Effectiveness of cross-platform optimizations for a java just-in-time compiler
OOPSLA '03 Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programing, systems, languages, and applications
Overview of the IBM Java just-in-time compiler
IBM Systems Journal
Hi-index | 0.02 |
The bit-by-bit reproducibility of floating-point results, which is defined by the IEEE 754 standard, prohibits optimizations such as reassociation and the use of native operations such as fused multiply-add (FMA), and thus it significantly impairs floating-point performance. Recent network-oriented languages such as Java strictly conform to the standard, and thus their numerical computing performance becomes inherently lower than conventional languages.In this paper, we propose a new software technique, called floating-point (FP) speculation, to optimize floating-point performance while preserving the bit-by-bit reproducibility of the results. We execute the fast unsafe code and the slow verification code in parallel. The unsafe code does not wait for the verification code, and is immediately followed by the subsequent code that uses the probable result from the unsafe code assuming the speculation will succeed. The improvement from FP speculation results from this earlier start of the subsequent code.Unlike other speculation techniques, FP speculation does not require any special instructions or hardware support. Rather, it exploits unused floating-point registers and execution units. Therefore it is generally applicable for processor architectures that have sufficient floating-point resources.