An automatic object inlining optimization and its evaluation

  • Authors: Julian Dolby; Andrew Chien

  • Affiliations: IBM T. J. Watson Research Center, Yorktown Heights, NY; Department of Computer Science and Engineering, University of California, San Diego

  • Venue: PLDI '00: Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation
  • Year: 2000

Abstract

Automatic object inlining [19, 20] transforms heap data structures by fusing parent and child objects together. It can improve runtime by reducing object allocation and pointer dereference costs. We report continuing work studying object inlining optimizations. In particular, we present a new semantic derivation of the correctness conditions for object inlining, and a program analysis that extends our previous work. We also present an object inlining transformation, focusing on a new algorithm that optimizes class field layout to minimize code expansion. Finally, we detail a fuller evaluation on eleven programs and libraries (including Xpdf, the 25,000-line Portable Document Format (PDF) file browser) that utilizes hardware measures of impact on the memory system. We show that our analysis scales effectively to large programs, finding many inlinable fields (45 in Xpdf) at acceptable cost, and we show that, on some programs, it finds nearly all fields for which object inlining is correct, averaging 40% of such fields across our benchmarks. We implement our analyses in an advanced analysis infrastructure, and we show that, compared to traditional 1-CFA, that infrastructure provides better results at lower and more scalable cost. Across all programs, the analysis identified about 30% of objects as inlinable on average. Our transformation increases code size by only 20% while inlining this 30% of fields. Inlining these objects eliminated on average 28% of field reads, 58% of object creations, and 12% of all loads. Further, the optimized programs have significantly improved memory reference behavior, producing 25% fewer L1 data cache misses and 25% fewer read stalls. On average, runtime improved by 14%.