Liquid SIMD: Abstracting SIMD Hardware using Lightweight Dynamic Mapping

Authors:
Nathan Clark;Amir Hormati;Sami Yehia;Scott Mahlke;Krisztian Flautner
Affiliations:
Advanced Computer Architecture Laboratory, University of Michigan - Ann Arbor, MI. ntclark@umich.edu;Advanced Computer Architecture Laboratory, University of Michigan - Ann Arbor, MI. hormati@umich.edu;ARM, Ltd., Cambridge, United Kingdom. sami.yehia@arm.com;Advanced Computer Architecture Laboratory, University of Michigan - Ann Arbor, MI. mahlke@umich.edu;ARM, Ltd., Cambridge, United Kingdom. krisztian.flautner@arm.com
Venue:
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Year:
2007

Citing 0
Cited 9

VEAL: Virtualized Execution Accelerator for Loops

ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Toward a multicore architecture for real-time ray-tracing

Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Dynamic warp formation: Efficient MIMD control flow on SIMD graphics hardware

ACM Transactions on Architecture and Code Optimization (TACO)
An instruction-systolic programmable shader architecture for multi-threaded 3D graphics processing

Journal of Parallel and Distributed Computing
Mighty-morphing power-SIMD

CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Speculatively vectorized bytecode

Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Vapor SIMD: Auto-vectorize once, run everywhere

CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
Dynamic compilation of data-parallel kernels for vector processors

Proceedings of the Tenth International Symposium on Code Generation and Optimization
On the advantage of time-varying diversity of workload on functionally asymmetric multi-core

Proceedings of International Workshop on Adaptive Self-tuning Computing Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

Microprocessor designers commonly utilize SIMD accelerators and their associated instruction set extensions to provide substantial performance gains at a relatively low cost for media applications. One of the most difficult problems with using SIMD accelerators is forward migration to newer generations. With larger hardware budgets and more demands for performance, SIMD accelerators evolve with both larger data widths and increased functionality with each new generation. However, this causes difficult problems in terms of binary compatibility, software migration costs, and expensive pensive redesign of the instruction set architecture. In this work, we propose Liquid SIMD to decouple the instruction set architecture from the SIMD accelerator SIMD instructions are expressed using a processor's baseline scalar instruction set, and light-weight dynamic translation maps the representation onto a broad family of SIMD accelerators. Liquid SIMD effectively bypasses the problems inherent to instruction set modification and binary compatibility across accelerator generations. We provide a detailed description of changes to a compilation framework and processor pipeline needed to support this abstraction. Additionally, we show that the hardware overhead of dynamic optimization is modest, hardware changes do not affect cycle time of the processor, and the performance impact of abstracting the SIMD accelerator is negligible. We conclude that using dynamic techniques to map instructions onto SIMD accelerators is an effective way to improve computation efficiency, without the overhead associated with modifying the instruction set.