Trace selection for compiling large C application programs to microcode
MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
The design of a high performance low power microprocessor
ISLPED '96 Proceedings of the 1996 international symposium on Low power electronics and design
The filter cache: an energy efficient memory structure
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Procedure placement using temporal ordering information
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Code placement techniques for cache miss rate reduction
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Code compression for embedded systems
DAC '98 Proceedings of the 35th annual Design Automation Conference
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Optimal spilling for CISC machines with few registers
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Memory Issues in Embedded Systems-on-Chip: Optimizations and Exploration
Memory Issues in Embedded Systems-on-Chip: Optimizations and Exploration
Tuning of loop cache architectures to programs in embedded system design
Proceedings of the 15th international symposium on System Synthesis
Reducing energy consumption by dynamic copying of instructions onto onchip memory
Proceedings of the 15th international symposium on System Synthesis
Scratchpad memory: design alternative for cache on-chip memory in embedded systems
Proceedings of the tenth international symposium on Hardware/software codesign
A 16-bit mixed-signal microsystem with integrated CMOS-MEMS clock reference
Proceedings of the 40th annual Design Automation Conference
GLS '97 Proceedings of the 7th Great Lakes Symposium on VLSI
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Energy and Performance Improvements in Microprocessor Design Using a Loop Cache
ICCD '99 Proceedings of the 1999 IEEE International Conference on Computer Design
Assigning Program and Data Objects to Scratchpad for Energy Reduction
Proceedings of the conference on Design, automation and test in Europe
Compiler-decided dynamic memory allocation for scratch-pad based embedded systems
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Circuit and microarchitectural techniques for reducing cache leakage power
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Instruction buffering exploration for low energy VLIWs with instruction clusters
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Dynamic overlay of scratchpad memory for energy minimization
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Exploiting Fixed Programs in Embedded Systems: A Loop Cache Example
IEEE Computer Architecture Letters
Compilers: Principles, Techniques, and Tools (2nd Edition)
Compilers: Principles, Techniques, and Tools (2nd Edition)
A 16-bit, low-power microsystem with monolithic MEMS-LC clocking
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Performance analysis of binary code protection
WSC '05 Proceedings of the 37th conference on Winter simulation
Software-based instruction caching for embedded processors
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Scratchpad allocation for data aggregates in superperfect graphs
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
DRIM: a low power dynamically reconfigurable instruction memory hierarchy for embedded systems
Proceedings of the conference on Design, automation and test in Europe
Instruction cache energy saving through compiler way-placement
Proceedings of the conference on Design, automation and test in Europe
Dynamic code footprint optimization for the IBM Cell Broadband Engine
IWMSE '09 Proceedings of the 2009 ICSE Workshop on Multicore Software Engineering
Compiler-directed scratchpad memory management via graph coloring
ACM Transactions on Architecture and Code Optimization (TACO)
Scratchpad allocation for concurrent embedded software
ACM Transactions on Programming Languages and Systems (TOPLAS)
Fine-grain dynamic instruction placement for L0 scratch-pad memory
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Scratchpad memory allocation for data aggregates via interval coloring in superperfect graphs
ACM Transactions on Embedded Computing Systems (TECS)
Overlay techniques for scratchpad memories in low power embedded processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A dynamic instruction scratchpad memory for embedded processors managed by hardware
ARCS'11 Proceedings of the 24th international conference on Architecture of computing systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Reducing memory space consumption through dataflow analysis
Computer Languages, Systems and Structures
Power efficient instruction caches for embedded systems
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
Link-time optimization for power efficiency in a tagless instruction cache
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
DLIC: Decoded loop instructions caching for energy-aware embedded processors
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
Modern embedded microprocessors use low power on-chip memories called scratch-pad memories to store frequently executed instructions and data. Unlike traditional caches, scratch-pad memories lack the complex tag checking and comparison logic, thereby proving to be efficient in area and power. In this work, we focus on exploiting scratch-pad memories for storing hot code segments within an application. Static placement techniques focus on placing the most frequently executed portions of programs into the scratch-pad. However, static schemes are inherently limited by not allowing the contents of the scratch-pad memory to change at run time. In a large fraction of applications, the instruction memory footprints exceed the scratch-pad memory size, thereby limiting the usefulness of the scratch-pad. We propose a compiler managed dynamic placement algorithm, wherein multiple hot code sequences, or traces, are overlapped with each other in the scratch-pad memory at different points in time during execution. Special copy instructions are provided to copy the traces into the scratch-pad memory at run-time. Using a power estimate, the compiler initially selects the most frequent traces in an application for relocation into the scratch-pad memory. Through iterative code motion and redundancy elimination, copy instructions are inserted in infrequently executed regions of the code. For a 64-byte code cache, the compiler managed dynamic placement achieves an average of 64% energy improvement over the static solution in a low-power embedded microcontroller.