Memory access coalescing: a technique for eliminating redundant memory accesses
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Olden: parallelizing programs with dynamic data structures on distributed-memory machines
Olden: parallelizing programs with dynamic data structures on distributed-memory machines
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
Cache-conscious data placement
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Cache-conscious structure layout
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Bidwidth analysis with application to silicon compilation
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Exploiting superword level parallelism with multimedia instruction sets
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Frequent value compression in data caches
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Frequent value locality and value-centric data cache design
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Efficient and flexible value sampling
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Improving Cache Behavior of Dynamically Allocated Data Structures
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Bit section instruction set extension of ARM for embedded applications
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Bitwidth aware global register allocation
POPL '03 Proceedings of the 30th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
A Representation for Bit Section Based Analysis and Optimization
CC '02 Proceedings of the 11th International Conference on Compiler Construction
Automatic pool allocation for disjoint data structures
Proceedings of the 2002 workshop on Memory system performance
Data size optimizations for java programs
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Static array storage optimization in MATLAB
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Memory overflow protection for embedded systems using run-time checks, reuse and compression
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Field level analysis for heap space optimization in embedded java environments
Proceedings of the 4th international symposium on Memory management
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Whole execution traces and their applications
ACM Transactions on Architecture and Code Optimization (TACO)
Transparent pointer compression for linked data structures
Proceedings of the 2005 workshop on Memory system performance
Memory overflow protection for embedded systems using run-time checks, reuse, and compression
ACM Transactions on Embedded Computing Systems (TECS)
No bit left behind: the limits of heap data compression
Proceedings of the 7th international symposium on Memory management
ECOOP'07 Proceedings of the 2007 conference on Object-oriented technology
Implementation, compilation, optimization of object-oriented languages, programs and systems
ECOOP'06 Proceedings of the 2006 conference on Object-oriented technology: ECOOP 2006 workshop reader
Cache aware compression for processor debug support
Proceedings of the Conference on Design, Automation and Test in Europe
Memory-access-aware data structure transformations for embedded software with dynamic data accesses
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2002 international symposium on low-power electronics and design (ISLPED)
Speculative subword register allocation in embedded processors
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Compiler optimizations using data compression to decrease address reference entropy
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Object-relative addressing: compressed pointers in 64-bit java virtual machines
ECOOP'07 Proceedings of the 21st European conference on Object-Oriented Programming
Cachetor: detecting cacheable data to remove bloat
Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering
Hi-index | 0.01 |
We introduce a class of transformations which modify the representation of dynamic data structures used in programs with the objective of compressing their sizes. We have developed the commonprefix and narrow-data transformations that respectively compress a 32 bit address pointer and a 32 bit integer field into 15 bit entities. A pair of fields which have been compressed by the above compression transformations are packed together into a single 32 bit word. The above transformations are designed to apply to data structures that are partially compressible, that is, they compress portions of data structures to which transformations apply and provide a mechanism to handle the data that is not compressible. The accesses to compressed data are efficiently implemented by designing data compression extensions (DCX) to the processor's instruction set. We have observed average reductions in heap allocated storage of 25% and average reductions in execution time and power consumption of 30%. If DCX support is not provided the reductions in execution times fall from 30% to 12.5%.