Fast Quadruple Precision Arithmetic Library on Parallel Computer SR11000/J2

  • Authors:
  • Takahiro Nagai;Hitoshi Yoshida;Hisayasu Kuroda;Yasumasa Kanada

  • Affiliations:
  • Dept. of Frontier Informatics, The University of Tokyo, Tokyo, Japan;Dept. of Frontier Informatics, The University of Tokyo, Tokyo, Japan;Dept. of Frontier Informatics, The University of Tokyo, Tokyo, Japan and The Information Technology Center, The University of Tokyo, Tokyo, Japan;Dept. of Frontier Informatics, The University of Tokyo, Tokyo, Japan and The Information Technology Center, The University of Tokyo, Tokyo, Japan

  • Venue:
  • ICCS '08 Proceedings of the 8th international conference on Computational Science, Part I
  • Year:
  • 2008

Abstract

This paper introduces fast quadruple precision arithmetic for the four basic operations and for multiply-add operations. The proposed methods achieve a speed-up of up to 5 times over gcc 4.1.1 on the POWER 5+ processor used in the parallel computer SR11000/J2. We also developed a fast quadruple precision vector library optimized for the POWER 5 architecture. Quadruple precision numbers, which are the 128-bit long double data type, are emulated with a pair of 64-bit double values on the POWER 5+ processor of the SR11000/J2 by both the Hitachi Optimizing Compiler and gcc 4.1.1. Because the emulation must avoid rounding errors in quadruple precision arithmetic operations, it has a high computational cost. The proposed methods focus on optimizing the number of registers and instruction latency.
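
To illustrate why emulating a quadruple precision value with a pair of doubles is costly, the following is a minimal sketch of textbook double-double addition and multiplication, built on Knuth's TwoSum and Dekker's TwoProd error-free transformations. It is not the paper's optimized implementation: the type `dd` and the functions `two_sum`, `two_prod`, `dd_add`, `dd_mul`, and `dd_check` are hypothetical names chosen for this example, and the code assumes IEEE 754 double arithmetic with round-to-nearest and no value-changing compiler optimizations (for example, no `-ffast-math`).

```c
#include <assert.h>

/* Hypothetical type: the value is the unevaluated sum hi + lo. */
typedef struct { double hi, lo; } dd;

/* Knuth's TwoSum: returns s = fl(a + b) and stores the exact
   rounding error of that addition in *err (6 flops). */
double two_sum(double a, double b, double *err) {
    double s  = a + b;
    double bb = s - a;
    *err = (a - (s - bb)) + (b - bb);
    return s;
}

/* Veltkamp split: a == *hi + *lo, each half fitting in 26 bits. */
void split(double a, double *hi, double *lo) {
    double t = 134217729.0 * a;   /* 2^27 + 1 */
    *hi = t - (t - a);
    *lo = a - *hi;
}

/* Dekker's TwoProd: returns p = fl(a * b) and stores the exact
   rounding error of that product in *err (barring overflow/underflow). */
double two_prod(double a, double b, double *err) {
    double p = a * b;
    double ah, al, bh, bl;
    split(a, &ah, &al);
    split(b, &bh, &bl);
    *err = ((ah * bh - p) + ah * bl + al * bh) + al * bl;
    return p;
}

/* Double-double addition: add the high parts exactly, fold in the
   low parts, then renormalize with a second TwoSum. */
dd dd_add(dd a, dd b) {
    dd r;
    double e;
    double s = two_sum(a.hi, b.hi, &e);
    e += a.lo + b.lo;
    r.hi = two_sum(s, e, &r.lo);
    return r;
}

/* Double-double multiplication: exact product of the high parts
   plus the two cross terms; the lo * lo term is dropped. */
dd dd_mul(dd a, dd b) {
    dd r;
    double e;
    double p = two_prod(a.hi, b.hi, &e);
    e += a.hi * b.lo + a.lo * b.hi;
    r.hi = two_sum(p, e, &r.lo);
    return r;
}

/* Sanity check: in a normalized pair, lo is too small to change hi. */
void dd_check(dd a) {
    assert(a.hi + a.lo == a.hi);
}
```

In this sketch a single emulated addition takes 14 double operations, and a multiplication takes more, which is the kind of overhead the abstract refers to. On hardware with a fused multiply-add instruction, the rounding error of a product can instead be obtained with a single fused operation, which is one reason multiply-add matters for this kind of emulation.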