Performance is a central topic in computer science. Algorithm efficiency is pivotal in software applications ranging from life-support systems to automatic teller machines. During the 1960s, efficiency was closely associated with structured programming, elimination of the goto statement, and modular coding. Forty years later, software students seldom appreciate the role their coding choices play in an algorithm's overall performance, an understandable situation given today's very fast processors and highly efficient compilers. A wealth of research has gone into processor architecture and compiler optimization, but little thought has been given to what today's programming students need to understand about pipelining and instruction-level parallelism in order to avoid performance-robbing coding techniques. This paper explains how programmers can work with compilers and today's pipelined processors to expose parallelism in their code and significantly improve performance, without sacrificing software engineering principles such as abstraction, modularity, and information hiding that provide for rapid application development, reuse, and maintainability of an application.