A methodology for quickly estimating machine performance is developed. A first-order estimate is based on the average degree of machine parallelism. A second-order model corrects for the effects of nonuniformities in instruction-level and machine parallelism and is shown to be accurate to within 15% for three widely different machine pipelines: the CRAY-1, the MultiTitan, and a dual-issue superscalar machine.
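
The abstract does not give the model's equations, so the following is only an illustrative sketch of why a correction for nonuniformity can matter, not the paper's actual method. It assumes, purely for illustration, that achievable throughput in any interval is capped by the smaller of the instruction-level parallelism available in the code and the parallelism the machine can exploit. The function names `first_order` and `second_order`, the per-interval lists, and the sample numbers are all hypothetical.

```python
def first_order(ilp, mp):
    """Estimate from averages only (assumed form): min(mean ILP, mean MP)."""
    mean_ilp = sum(ilp) / len(ilp)
    mean_mp = sum(mp) / len(mp)
    return min(mean_ilp, mean_mp)


def second_order(ilp, mp):
    """Apply the same cap per interval, then average (assumed form).

    Taking the minimum interval by interval lets bursts of parallelism
    that the machine cannot absorb go to waste, which the averages hide.
    """
    return sum(min(i, m) for i, m in zip(ilp, mp)) / len(ilp)


# Hypothetical per-interval parallelism: the code is bursty, the machine is uniform.
ilp = [4.0, 1.0, 1.0, 2.0]   # instruction-level parallelism available in the code
mp = [2.0, 2.0, 2.0, 2.0]    # parallelism the machine can exploit

print(first_order(ilp, mp))   # estimate that ignores nonuniformity
print(second_order(ilp, mp))  # estimate that accounts for it
```

Under these assumptions the averaged estimate can never be lower than the per-interval one, since the mean of per-interval minima is at most the minimum of the means. The gap between the two is the kind of effect a second-order correction would need to capture.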