Traditional vector architectures often lack virtual memory support because fast, precise exceptions are difficult to implement on these machines. In this paper, we propose a new exception handling model for vector architectures based on software restart markers, which divide the program into idempotent regions of code. Within a region, the processor can commit instruction results to the architectural state in any order. If an exception occurs, the machine jumps immediately to the exception handler and kills ongoing instructions. To restart execution, the operating system only has to resume at the start of the region. This approach avoids the area and energy overhead of buffering uncommitted vector unit state, which a high-performance precise exception mechanism would otherwise require, while still providing a simple exception handling interface for the operating system. Our scheme also removes the requirement to preserve vector register file contents in the event of a context switch. We show that our approach causes an average performance reduction of less than 3% across a variety of benchmarks compared with a vector machine that does not support virtual memory.
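The restart idea can be illustrated with a small software model. The sketch below is not the paper's implementation: `PageFault`, `run_region`, and `run_with_restart` are hypothetical names, and the "region" is just a loop that reads one array and writes another. It assumes the region never overwrites its own inputs, which is what makes re-execution from the region's start safe even though some results were already committed, in arbitrary order, before the exception.

```python
import random


class PageFault(Exception):
    """Hypothetical stand-in for a virtual-memory exception."""


def run_region(src, dst, scale, fault_at=None):
    """Toy idempotent region: dst[i] = scale * src[i].

    The region only reads src and only writes dst, so running it
    again from the top always produces the same final dst.
    """
    order = list(range(len(src)))
    random.shuffle(order)  # results may commit in any order within a region
    for count, i in enumerate(order):
        if fault_at is not None and count == fault_at:
            # Exception: remaining work is killed, dst is left partially updated.
            raise PageFault(i)
        dst[i] = scale * src[i]


def run_with_restart(src, dst, scale, fault_at):
    """Toy 'operating system' loop: handle the fault, then restart the region."""
    restarts = 0
    while True:
        try:
            run_region(src, dst, scale, fault_at)
            return restarts
        except PageFault:
            restarts += 1
            fault_at = None  # assumption: the handler resolved the fault


src = [1, 2, 3, 4]
dst = [0, 0, 0, 0]
restarts = run_with_restart(src, dst, scale=2, fault_at=2)
print(dst, restarts)
```

No partial state is saved or rolled back in this model: the handler only needs to know where the region begins. A region that updated its own inputs in place (for example `src[i] = scale * src[i]`) would not be idempotent, and restarting it could apply the update twice.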