A software cache (SWC) is cache functionality emulated in software on a compiler-controlled scratch-pad memory (SPM). Such structures are useful when standard SPM allocation strategies cannot be applied because the memory reference patterns in the source code are hard to analyze. SPM data allocation strategies generally rely on compile-time inference of spatial and temporal reuse; the usual flow is to copy a block or tile of array data into the SPM, process it, and finally copy it back. However, when array index functions are complicated by conditionals, complex expressions, and dependence on run-time data, the SPM compiler has to fall back on expensive DMA transfers of individual words, leading to poor performance. Software caches can play a crucial role in improving performance under such circumstances: their access times are longer than those of direct SPM access, but they retain the advantage, shared with hardware caches, of exploiting spatial and temporal locality discovered at run time. We present the first automated compiler data allocation strategy that considers the presence of a software cache in SPM space and decides which arrays should be accessed through it, and at which times. An array may be accessed differently in different parts of a program; our algorithm analyzes these uses and routes an array through the SWC only where doing so is efficient, based on a cost model of the overheads involved in SPM/SWC transitions. We implemented our technique in an LLVM-based framework and experimented with several applications on a Cell-based machine. Our technique yields up to 82% overall performance improvement over a conventional SPM mapping algorithm and up to 27% over a typical SWC-enhanced implementation.
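To make the trade-off concrete, the following is a minimal illustrative sketch, not the paper's implementation. It models a direct-mapped, read-only software cache held in a small local buffer and compares its cost against per-word DMA for an irregular index stream. The class `SoftwareCache`, the function `prefer_swc`, and all cycle counts are hypothetical names and values chosen only for illustration; the actual cost model also accounts for SPM/SWC transition overheads that this sketch ignores.

```python
class SoftwareCache:
    """Direct-mapped, read-only software cache emulated in a local buffer.

    Hypothetical model: `main_memory` stands in for off-chip memory and each
    line fill stands in for one DMA transfer.
    """

    def __init__(self, main_memory, num_lines=4, line_words=4):
        self.mem = main_memory
        self.num_lines = num_lines
        self.line_words = line_words
        self.tags = [None] * num_lines          # block number held by each slot
        self.lines = [[0] * line_words for _ in range(num_lines)]
        self.lookups = 0                        # every access pays a tag check
        self.dma_transfers = 0                  # only misses pay a line fill

    def read(self, index):
        self.lookups += 1
        block = index // self.line_words
        slot = block % self.num_lines
        if self.tags[slot] != block:            # miss: fetch the whole line
            base = block * self.line_words
            self.lines[slot] = self.mem[base:base + self.line_words]
            self.tags[slot] = block
            self.dma_transfers += 1
        return self.lines[slot][index % self.line_words]


def prefer_swc(lookups, misses, lookup_cost, line_dma_cost, word_dma_cost):
    """Return True if the (assumed) SWC cost beats one DMA per accessed word."""
    swc_cost = lookups * lookup_cost + misses * line_dma_cost
    direct_cost = lookups * word_dma_cost
    return swc_cost < direct_cost


# Irregular, data-dependent index stream that still has locality at run time.
memory = list(range(100, 164))
indices = [3, 1, 2, 21, 20, 3, 0, 22, 1, 23]

cache = SoftwareCache(memory)
values = [cache.read(i) for i in indices]

# Assumed cycle counts, for illustration only.
use_swc = prefer_swc(cache.lookups, cache.dma_transfers,
                     lookup_cost=10, line_dma_cost=200, word_dma_cost=150)
```

In this run the ten accesses touch only two cache lines, so the cache issues two line fills where the fallback strategy would issue ten single-word transfers. Under the assumed costs the software cache wins, but with little reuse or a high lookup overhead the comparison can go the other way, which is why the decision is made per array and per program region.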