Making Pointer-Based Data Structures Cache Conscious

Authors:
Trishul M. Chilimbi;Mark D. Hill;James R. Larus
Venue:
Computer
Year:
2000

Abstract

Rapid increases in processor speed and slower increases in memory speed have produced memory access times that exceed the cost of simple arithmetic operations. The ubiquitous hardware solution to this problem is memory caches, which exploit program locality to reduce the average latency. Other techniques use complex hardware and software to reduce or hide the high cost of memory accesses.

The processor-memory gap requires a hierarchy of two or more caches between the processor and memory. The cost of finding data in this hierarchy undercuts the fundamental RAM model assumption that all memory accesses have unit cost.

To narrow the widening gap between processor and memory performance, the authors propose exploiting pointer-based structures, whose elements can be placed at different memory and cache locations. Careful placement of structure elements improves the cache locality, and hence the performance, of pointer-manipulating programs.
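
The link between element placement and cache locality can be illustrated with a toy model. The sketch below is not the authors' technique or code; it only counts how many distinct cache blocks a single traversal of a linked structure would touch under two hypothetical layouts. The 64-byte block size, the 16-byte node size, the 4096-byte scatter stride, and all function names are assumptions chosen for illustration, and addresses are modelled as plain integers relative to a block-aligned base.

```python
BLOCK_SIZE = 64   # assumed cache block size in bytes
NODE_SIZE = 16    # assumed size of one list node in bytes


def blocks_touched(node_addresses, block_size=BLOCK_SIZE, node_size=NODE_SIZE):
    """Count the distinct cache blocks touched when each node is visited once."""
    blocks = set()
    for addr in node_addresses:
        first = addr // block_size
        last = (addr + node_size - 1) // block_size
        # A node that straddles a block boundary touches both blocks.
        blocks.update(range(first, last + 1))
    return len(blocks)


def clustered_layout(n, base=0, node_size=NODE_SIZE):
    """Nodes packed back to back in traversal order."""
    return [base + i * node_size for i in range(n)]


def scattered_layout(n, base=0, stride=4096):
    """One node per 4096-byte stride, a stand-in for a fragmented heap."""
    return [base + i * stride for i in range(n)]


if __name__ == "__main__":
    n = 64
    print(blocks_touched(clustered_layout(n)))   # 16
    print(blocks_touched(scattered_layout(n)))   # 64
```

Under these assumptions, packing four nodes into each block cuts the number of blocks a 64-node traversal touches from 64 to 16. The model ignores cache capacity, associativity, and prefetching, so it indicates only the direction of the effect, not its magnitude.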