VLIW coprocessor for IEEE-754 quadruple-precision elementary functions

  • Authors:
  • Yuanwu Lei;Yong Dou;Lei Guo;Jinbo Xu;Jie Zhou;Yazhuo Dong;Hongjian Li

  • Affiliations:
  • National University of Defense Technology, Changsha, China;National University of Defense Technology, Changsha, China;National University of Defense Technology, Changsha, China;National University of Defense Technology, Changsha, China;National University of Defense Technology, Changsha, China;People's Liberation Army, Beijing, China;Logistics Scientific Institute, Beijing, China

  • Venue:
  • ACM Transactions on Architecture and Code Optimization (TACO)
  • Year:
  • 2008

Abstract

This article presents QP_VELP, a unified VLIW coprocessor for quadruple-precision (Quad) arithmetic and elementary functions, built on a common group of atomic operation units. The explicit parallelism of VLIW instructions, combined with Estrin's scheme for polynomial evaluation, is used to improve performance. A two-level VLIW instruction RAM scheme is introduced to achieve high scalability and customizability, even for more complex key program kernels. Finally, the Quad arithmetic accelerator (QAA), which contains an array of QP_VELP units, is implemented as an ASIC. Compared with a hyper-threaded software implementation on an Intel Xeon E5620, QAA with 8 QP_VELP units achieves an 18X performance improvement.
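
The abstract names Estrin's scheme but does not spell it out. The following is a minimal Python sketch of the general technique, not the coprocessor's actual instruction sequence. The function names `horner` and `estrin` are hypothetical, and exact rational arithmetic (`fractions.Fraction`) stands in for quadruple-precision hardware only so that the two evaluation orders can be compared exactly.

```python
from fractions import Fraction


def horner(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) with Horner's rule (fully serial)."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def estrin(coeffs, x):
    """Evaluate the same polynomial with Estrin's scheme.

    At each level, neighbouring terms are paired as lo + hi * power.
    The pairs within one level do not depend on each other, so parallel
    hardware could compute them at the same time.
    """
    terms = list(coeffs)
    if not terms:
        return 0
    power = x
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0)  # pad so every term has a partner
        terms = [terms[i] + terms[i + 1] * power
                 for i in range(0, len(terms), 2)]
        power = power * power  # x, x^2, x^4, ...
    return terms[0]


if __name__ == "__main__":
    # Illustrative degree-7 polynomial; the coefficients are arbitrary.
    coeffs = [Fraction(1, k) for k in range(1, 9)]
    x = Fraction(3, 7)
    print(estrin(coeffs, x) == horner(coeffs, x))
```

For a polynomial with n coefficients, Horner's rule forms a chain of dependent multiply-add steps whose length grows linearly with n, whereas the number of Estrin levels grows only logarithmically. That independence within a level is the kind of parallelism an explicitly parallel instruction format is able to express.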