Register windows vs. register allocation
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Minimizing register usage penalty at procedure calls
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Operating system concepts (3rd ed.)
Operating system concepts (3rd ed.)
Limits of control flow on parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Sentinel scheduling: a model for compiler-controlled speculative execution
ACM Transactions on Computer Systems (TOCS)
Interprocedural may-alias analysis for pointers: beyond k-limiting
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Dynamic memory disambiguation using the memory conflict buffer
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Predictability of load/store instruction latencies
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
DAISY: dynamic compilation for 100% architectural compatibility
Proceedings of the 24th annual international symposium on Computer architecture
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Advanced compiler design and implementation
Advanced compiler design and implementation
Load latency tolerance in dynamically scheduled processors
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Tuning the Pentium Pro Microarchitecture
IEEE Micro
Three Architectural Models for Compiler-Controlled Speculative Execution
IEEE Transactions on Computers
Delayed Exceptions - Speculative Execution of Trapping Instructions
CC '94 Proceedings of the 5th International Conference on Compiler Construction
The Alpha 21264: A 500 MHz Out-of-Order Execution Microprocessor
COMPCON '97 Proceedings of the 42nd IEEE International Computer Conference
Optimizing the performance of dynamically-linked programs
TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
Preference-directed graph coloring
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Eliminating Exception Constraints of Java Programs for IA-64
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Beating in-order stalls with "flea-flicker" two-pass pipelining
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Partial redundancy elimination for access expressions by speculative code motion
Software—Practice & Experience
A high performance Kernel-Less Operating System architecture
ACSC '05 Proceedings of the Twenty-eighth Australasian conference on Computer Science - Volume 38
Beating In-Order Stalls with "Flea-Flicker" Two-Pass Pipelining
IEEE Transactions on Computers
Hi-index | 0.00 |
Increasing demands for processor performance have outstripped the pace of process and frequency improvements, pushing designers to find ways of increasing the amount of work that can be processed in parallel. Traditional RISC architectures use hardware approaches to obtain more instruction-level parallelism, with the compiler and the operating system (OS) having only indirect visibility into the mechanisms used.The IA-64 architecture [14] was specifically designed to enable systems which create and exploit high levels of instruction-level parallelism by explicitly encoding a program's parallelism in the instruction set [25]. This paper provides a qualitative summary of the IA-64 architecture features that support control and data speculation, and register stacking. The paper focusses on the functional synergy between these architectural elements (rather than their individual performance merits), and emphasizes how they were designed for cooperation between processor hardware, compilers and the OS.