Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays
This paper introduces a novel mixed precision methodology applicable to any Monte Carlo (MC) simulation. Datapaths operate at reduced precision, and the resulting errors are corrected by auxiliary sampling. An analytical model is developed for a reconfigurable accelerator system comprising a field-programmable gate array (FPGA) and a general-purpose processor (GPP). Optimisation based on mixed integer geometric programming determines the optimal reduced precision and the optimal resource allocation between the MC datapaths and the correction datapaths. Experiments show that the proposed mixed precision methodology requires up to 11% additional evaluations, while less than 4% of all evaluations are computed in the reference precision; the resulting designs are up to 7.1 times faster and 3.1 times more energy efficient than baseline double precision FPGA designs, and up to 163 times faster and 170 times more energy efficient than quad-core software designs optimised with the Intel compiler and Math Kernel Library. Our methodology also produces designs for pricing Asian options that are 4.6 times faster and 5.5 times more energy efficient than NVIDIA Tesla C2070 GPU implementations.
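The correction scheme described above can be sketched in software. The sketch below is illustrative, not the paper's implementation: it uses float32 as a stand-in for a custom reduced-precision FPGA datapath, float64 as the reference precision, and an arithmetic-average Asian call under geometric Brownian motion as the workload. The estimator is E[f_low] from a large reduced-precision sample plus E[f_ref - f_low] from a small auxiliary sample, so only a few percent of paths are evaluated in the reference precision. All parameter values and function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def asian_payoff(z, dtype, s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0):
    """Discounted arithmetic-average Asian call payoff per path.

    z     : (n_paths, n_steps) standard normal increments
    dtype : np.float32 emulates the reduced-precision datapath,
            np.float64 is the reference precision
    """
    n_steps = z.shape[1]
    dt = dtype(t / n_steps)
    drift = dtype(r - 0.5 * sigma**2) * dt
    vol = dtype(sigma) * np.sqrt(dt)
    # Simulate log-price paths entirely in the chosen precision
    log_paths = np.cumsum(drift + vol * z.astype(dtype), axis=1)
    avg = (dtype(s0) * np.exp(log_paths)).mean(axis=1)
    return np.exp(dtype(-r * t)) * np.maximum(avg - dtype(k), dtype(0.0))

# Main sampling: many paths in reduced precision
n_main, n_corr, n_steps = 200_000, 8_000, 64
z_main = rng.standard_normal((n_main, n_steps))
low_est = asian_payoff(z_main, np.float32).mean()

# Auxiliary correction: estimate E[f_ref - f_low] on a small shared sample,
# re-evaluating the *same* paths in both precisions
z_corr = rng.standard_normal((n_corr, n_steps))
corr = (asian_payoff(z_corr, np.float64)
        - asian_payoff(z_corr, np.float32)).mean()

price = float(low_est) + float(corr)  # bias-corrected mixed precision estimate
```

Because the correction term is the mean of a difference between two evaluations of identical paths, its variance is small and a modest auxiliary sample suffices, which is what makes spending silicon on reduced-precision datapaths profitable.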