Liquid SIMD: Abstracting SIMD Hardware using Lightweight Dynamic Mapping

  • Authors:
  • Nathan Clark, Amir Hormati, Sami Yehia, Scott Mahlke, Krisztian Flautner

  • Affiliations:
  • Nathan Clark, Amir Hormati, Scott Mahlke: Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, MI (ntclark@umich.edu, hormati@umich.edu, mahlke@umich.edu)
  • Sami Yehia, Krisztian Flautner: ARM, Ltd., Cambridge, United Kingdom (sami.yehia@arm.com, krisztian.flautner@arm.com)

  • Venue:
  • HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
  • Year:
  • 2007

Abstract

Microprocessor designers commonly utilize SIMD accelerators and their associated instruction set extensions to provide substantial performance gains at relatively low cost for media applications. One of the most difficult problems with using SIMD accelerators is forward migration to newer generations. With larger hardware budgets and greater demands for performance, SIMD accelerators evolve with both larger data widths and increased functionality in each new generation. However, this evolution creates difficult problems in terms of binary compatibility, software migration costs, and expensive redesign of the instruction set architecture. In this work, we propose Liquid SIMD to decouple the instruction set architecture from the SIMD accelerator: SIMD instructions are expressed using a processor's baseline scalar instruction set, and lightweight dynamic translation maps the representation onto a broad family of SIMD accelerators. Liquid SIMD effectively bypasses the problems inherent to instruction set modification and binary compatibility across accelerator generations. We provide a detailed description of the changes to a compilation framework and processor pipeline needed to support this abstraction. Additionally, we show that the hardware overhead of dynamic optimization is modest, the hardware changes do not affect the processor's cycle time, and the performance impact of abstracting the SIMD accelerator is negligible. We conclude that using dynamic techniques to map instructions onto SIMD accelerators is an effective way to improve computation efficiency, without the overhead associated with modifying the instruction set.
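
To make the abstraction concrete, the following is a minimal, hypothetical sketch (not taken from the paper) of the kind of scalar-ISA idiom a Liquid SIMD compiler might emit: an element-wise loop written entirely with baseline scalar instructions, which a hardware dynamic translator is assumed to recognize and re-map onto whatever SIMD accelerator width is available, with no SIMD instruction set extension in the binary.

```c
/*
 * Illustrative sketch only, assuming a Liquid SIMD-style contract:
 * data-parallel work is expressed as an ordinary scalar loop so the
 * binary runs unmodified on a scalar core, while a dynamic translator
 * in the pipeline is assumed to detect the idiom and execute it on a
 * 4-wide, 8-wide, or wider SIMD accelerator as available.
 */
#include <stdint.h>
#include <stddef.h>

/* Saturating add of two 8-bit arrays, a common media kernel.
 * Each iteration is an independent element-wise operation, the kind
 * of pattern a dynamic mapper could vectorize to any hardware width. */
void saturating_add_u8(uint8_t *dst, const uint8_t *a,
                       const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        dst[i] = (uint8_t)(sum > 255 ? 255 : sum);  /* saturate at 255 */
    }
}
```

Because the loop above carries no accelerator-specific encoding, the same binary could, under this scheme, migrate across accelerator generations; only the dynamic translation hardware needs to know the target SIMD width.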