Automatic tuning technique exploring within the hardware-specific constrained parameters

  • Authors: Toshiyuki Imamura; Ken Naono

  • Affiliations: Department of Computer Science, The University of Electro-Communications, Tokyo, Japan; Central Research Laboratory, Hitachi Ltd., Tokyo, Japan

  • Venue: LSSC'05 Proceedings of the 5th International Conference on Large-Scale Scientific Computing
  • Year: 2005

Abstract

This paper presents an efficient strategy for exploring sampling parameters in auto-tuning processes. The byte/flop ratio is used as a performance indicator, and finding the best parameter is formulated as an optimisation problem under hardware-specific constraints. We also evaluate the performance of various unrolled loops in both a rank-update operation and a matrix-vector multiplication, which appear in a key operation of an eigensolver. The tuned routines, running on a single processor of a Hitachi SR8000 and a Fujitsu VPP5000, achieve 1080 MFLOPS and 8342 MFLOPS, respectively.
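
To make the idea of a constrained parameter search concrete, the following is a minimal sketch and not the authors' method. It assumes a toy cost model for an unrolled matrix-vector kernel: the unroll factors `ui` and `uj`, the register-count formula, the candidate list, and the 32-register limit are all hypothetical choices made for illustration. The sketch enumerates unroll factors, discards those that violate the register constraint, and keeps the pair with the lowest estimated byte/flop ratio.

```python
from itertools import product

WORD_BYTES = 8  # assumed: double-precision operands


def bytes_per_flop(ui, uj):
    """Toy byte/flop estimate for a ui-by-uj unrolled block of y += A @ x.

    Assumption: the ui accumulators of y stay in registers, so each block
    loads ui*uj elements of A and uj elements of x.
    """
    loads = ui * uj + uj      # A elements plus the reused x elements
    flops = 2 * ui * uj       # one multiply and one add per A element
    return WORD_BYTES * loads / flops


def registers_needed(ui, uj):
    """Hypothetical register demand: accumulators + x values + A temporaries."""
    return ui + uj + ui * uj


def best_unrolling(candidates, max_registers):
    """Return the feasible (ui, uj) pair with the lowest byte/flop estimate.

    Ties are broken in favour of the pair that needs fewer registers.
    """
    feasible = [
        (ui, uj)
        for ui, uj in product(candidates, repeat=2)
        if registers_needed(ui, uj) <= max_registers
    ]
    if not feasible:
        raise ValueError("no unroll factors satisfy the register constraint")
    return min(feasible, key=lambda p: (bytes_per_flop(*p), registers_needed(*p)))


if __name__ == "__main__":
    best = best_unrolling((1, 2, 4, 8, 16), max_registers=32)
    print(best, bytes_per_flop(*best))
```

In a real auto-tuner the modelled estimate would only narrow the set of candidates; the surviving variants would still be timed on the target machine, since a model this simple ignores cache behaviour, vector length, and pipeline effects.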