The potential energy efficiency of vector acceleration

Authors:
Christophe Lemuet;Jack Sampson;Jean-Francois Collard;Norm Jouppi
Affiliations:
Hewlett-Packard Laboratories, Palo Alto, California;UCSD;Hewlett-Packard Laboratories, Palo Alto, California;Hewlett-Packard Laboratories, Palo Alto, California
Venue:
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Year:
2006

Citing 11
Cited 3

Adding a vector unit to a superscalar processor

ICS '99 Proceedings of the 13th international conference on Supercomputing
Wattch: a framework for architectural-level power analysis and optimizations

Proceedings of the 27th annual international symposium on Computer architecture
Vector vs. superscalar and VLIW architectures for embedded multimedia benchmarks

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Orion: a power-performance simulator for interconnection networks

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Decoupled vector architectures

HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Multithreaded Vector Architectures

HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
Overcoming the limitations of conventional vector processors

Proceedings of the 30th annual international symposium on Computer architecture
The Vector-Thread Architecture

Proceedings of the 31st annual international symposium on Computer architecture
Conjoined-Core Chip Multiprocessing

Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Cache Refill/Access Decoupling for Vector Machines

Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Merrimac: Supercomputing with Streams

Proceedings of the 2003 ACM/IEEE conference on Supercomputing

Converting massive TLP to DLP: a special-purpose processor for molecular orbital computations

Proceedings of the 4th international conference on Computing frontiers
Parallelizing the web browser

HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
Vector Extensions for Decision Support DBMS Acceleration

MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

Energy efficiency of computation is quickly becoming a key problem from the chip through the data center. This paper presents the first quantitative study of the potential energy efficiency of vector accelerators. We propose and study a vector accelerator architecture suitable for implementation in a 70nm technology. The vector architecture has a high-bandwidth on-chip cache system coupled to 16 independent memory channels. We show that such an accelerator can achieve speedups of 10X or more on loop kernels in comparison to a quad-issue superscalar uniprocessor, while using less energy. We also introduce run-ahead lanes, a complexity and energy efficient means of tolerating variable latency from crossbar contention, cache bank conflicts, cache misses, and the memory system. Run-ahead lanes only synchronize on dependencies or when explicitly directed.