Compiler optimizations for improving data locality
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Improving the ratio of memory operations to floating-point operations in loops
ACM Transactions on Programming Languages and Systems (TOPLAS)
A limit study of local memory requirements using value reuse profiles
Proceedings of the 28th annual international symposium on Microarchitecture
Improving data locality with loop transformations
ACM Transactions on Programming Languages and Systems (TOPLAS)
The intrinsic bandwidth requirements of ordinary programs
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Unroll-and-jam using uniformly generated sets
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Path-sensitive value-flow analysis
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
A new algorithm for scalar register promotion based on SSA form
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
The implementation and evaluation of fusion and contraction in array languages
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Load-reuse analysis: design and evaluation
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Optimized unrolling of nested loops
Proceedings of the 14th international conference on Supercomputing
Matching and searching analysis for parallel hardware implementation on FPGAs
FPGA '01 Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays
A balanced code placement framework
ACM Transactions on Programming Languages and Systems (TOPLAS)
Eliminating redundancies in sum-of-product array computations
ICS '01 Proceedings of the 15th international conference on Supercomputing
Register-sensitive selection, duplication, and sequencing of instructions
ICS '01 Proceedings of the 15th international conference on Supercomputing
Loop Transformations for Architectures with Partitioned Register Banks
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
Partial method compilation using dynamic profile information
OOPSLA '01 Proceedings of the 16th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Optimized Unrolling of Nested Loops
International Journal of Parallel Programming
Register tiling in nonrectangular iteration spaces
ACM Transactions on Programming Languages and Systems (TOPLAS)
Optimizing Loop Performance for Clustered VLIW Architectures
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Compiler-Controlled Caching in Superword Register Files for Multimedia Extension Architectures
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
An experimental evaluation of scalar replacement on scientific benchmarks
Software—Practice & Experience
Improving register allocation for subscripted variables
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Partial redundancy elimination for access expressions by speculative code motion
Software—Practice & Experience
Evaluating heuristics in automatically mapping multi-loop applications to FPGAs
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
Optimizing aggregate array computations in loops
ACM Transactions on Programming Languages and Systems (TOPLAS)
MPSoC memory optimization using program transformation
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Buffer and Register Allocation for Memory Space Optimization
Journal of VLSI Signal Processing Systems
Proceedings of the 2007 workshop on Automating service quality: Held at the International Conference on Automated Software Engineering (ASE)
Exploiting loop-dependent stream reuse for stream processors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Multiprocessor, Multithreading and Memory Optimization for On-Chip Multimedia Applications
Journal of Signal Processing Systems
Partial data reuse for windowing computations: performance modeling for FPGA implementations
ARC'07 Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications
The implementation of a coarse-grained reconfigurable architecture with loop self-pipelining
ARC'07 Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications
Exploiting the reuse supplied by loop-dependent stream references for stream processors
ACM Transactions on Architecture and Code Optimization (TACO)
Applying data copy to improve memory performance of general array computations
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Integrating Memory Optimization with Mapping Algorithms for Multi-Processors System-on-Chip
ACM Transactions on Embedded Computing Systems (TECS)
Interprocedural strength reduction of critical sections in explicitly-parallel programs
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Hi-index | 0.01 |