The cache performance and optimizations of blocked algorithms
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Improving the ratio of memory operations to floating-point operations in loops
ACM Transactions on Programming Languages and Systems (TOPLAS)
Tile size selection using cache organization and data layout
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Improving data locality with loop transformations
ACM Transactions on Programming Languages and Systems (TOPLAS)
C++ gems
An annotation language for optimizing software libraries
Proceedings of the 2nd conference on Domain-specific languages
Automatic loop transformations and parallelization for Java
Proceedings of the 14th international conference on Supercomputing
Optimizing compilers for modern architectures: a dependence-based approach
Optimizing compilers for modern architectures: a dependence-based approach
Optimizing Supercompilers for Supercomputers
Optimizing Supercompilers for Supercomputers
Dependence Analysis for Supercomputing
Dependence Analysis for Supercomputing
CONPAR '92/ VAPP V Proceedings of the Second Joint International Conference on Vector and Parallel Processing: Parallel Processing
A Comparison of Performance-Enhancing Strategies for Parallel Numerical Object-Oriented Frameworks
ISCOPE '97 Proceedings of the Scientific Computing in Object-Oriented Parallel Environments
Containers on the Parallelization of General-Purpose Java Programs
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
Macro Processing in Object-Oriented Languages
TOOLS '98 Proceedings of the Technology of Object-Oriented Languages and Systems
Transforming Complex Loop Nests for Locality
The Journal of Supercomputing
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
Improving the computational intensity of unstructured mesh applications
Proceedings of the 19th annual international conference on Supercomputing
Self-adapting numerical software (SANS) effort
IBM Journal of Research and Development
Proceedings of the 22nd annual international conference on Supercomputing
Extending Automatic Parallelization to Optimize High-Level Abstractions for Multicore
IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
Classification and utilization of abstractions for optimization
ISoLA'04 Proceedings of the First international conference on Leveraging Applications of Formal Methods
Applying data copy to improve memory performance of general array computations
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Effective source-to-source outlining to support whole program empirical optimization
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
A ROSE-Based OpenMP 3.0 research compiler supporting multiple runtime libraries
IWOMP'10 Proceedings of the 6th international conference on Beyond Loop Level Parallelism in OpenMP: accelerators, Tasking and more
An extensible open-source compiler infrastructure for testing
HVC'05 Proceedings of the First Haifa international conference on Hardware and Software Verification and Testing
Automated programmable control and parameterization of compiler optimizations
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
POET: a scripting language for applying parameterized source-to-source program transformations
Software—Practice & Experience
Hi-index | 0.00 |
Optimizing compilers have a long history of applying loop transformations to C and Fortran scientific applications. However, such optimizations are rare in compilers for object-oriented languages such as C++ or Java, where loops operating on user-defined types are left unoptimized due to their unknown semantics. Our goal is to reduce the performance penalty of using high-level object-oriented abstractions. We propose an approach that allows the explicit communication between programmers and compilers. We have extended the traditional Fortran loop optimizations with an open interface. Through this interface, we have developed techniques to automatically recognize and optimize user-defined array abstractions. In addition, we have developed an adapted constant-propagation algorithm to automatically propagate properties of abstractions. We have implemented these techniques in a C++ source-to-source translator and have applied them to optimize several kernels written using an array-class library. Our experimental results show that using our approach, applications using high-level abstractions can achieve comparable, and in cases superior, performance to that achieved by efficient low-level hand-written codes.