Software pipelining: an effective scheduling technique for VLIW machines
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Profile-guided automatic inline expansion for C programs
Software—Practice & Experience
Eliminating false data dependences using the Omega test
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Sentinel scheduling for VLIW and superscalar processors
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Dynamic memory disambiguation using the memory conflict buffer
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Global predicate analysis and its application to register allocation
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Integrated predicated and speculative execution in the IMPACT EPIC architecture
Proceedings of the 25th annual international symposium on Computer architecture
Wavefront scheduling: path based data representation and scheduling of subgraphs
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Modular interprocedural pointer analysis using access paths: design, implementation, and evaluation
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Accurate and efficient predicate analysis with binary decision diagrams
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
The Intel IA-64 Compiler Code Generator
IEEE Micro
Compiler optimization-space exploration
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Itanium 2 Processor Microarchitecture
IEEE Micro
A compiler framework for speculative analysis and optimizations
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Beating in-order stalls with "flea-flicker" two-pass pipelining
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Automatic Thread Extraction with Decoupled Software Pipelining
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
"Flea-flicker" Multipass Pipelining: An Alternative to the High-Power Out-of-Order Offense
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Beating In-Order Stalls with "Flea-Flicker" Two-Pass Pipelining
IEEE Transactions on Computers
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Profile-assisted Compiler Support for Dynamic Predication in Diverge-Merge Processors
Proceedings of the International Symposium on Code Generation and Optimization
Communication optimizations for global multi-threaded instruction scheduling
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Trimaran: an infrastructure for research in instruction-level parallelism
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
An intermediate representation for speculative optimizations in a dynamic compiler
Proceedings of the 7th ACM workshop on Virtual machines and intermediate languages
Hi-index | 0.00 |
Explicitly-Parallel Instruction Computing (EPIC) providesarchitectural features, including predication and explicitcontrol speculation, intended to enhance the compiler'sability to expose instruction-level parallelism (ILP) incontrol-intensive programs. Aggressive structural transformationsusing these features, though described in theliterature, have not yet been fully characterized in completesystems. Using the Intel Itanium 2 microprocessor,the SPECint2000 benchmarks and the IMPACT Compilerfor IA-64, a research compiler competitive with thebest commercial compilers on the platform, we providean in situ evaluation of code generated using aggressive,EPIC-enabled techniques in a reality-constrained microarchitecture.Our work shows a 1.13 average speedup(up to 1.50) due to these compilation techniques, relativeto traditionally-optimized code at the same inlining andpointer analysis levels, and a 1.55 speedup (up to 2.30) relativeto GNU GCC, a solid traditional compiler. Detailedresults show that the structural compilation approach providesbenefits far beyond a decrease in branch mispredictionpenalties and that it both positively and negatively impactsinstruction cache performance. We also demonstratethe increasing significance of runtime effects, such as datacache and TLB, in determining end performance and theinteraction of these effects with control speculation.