Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
The cache performance and optimizations of blocked algorithms
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
1995 high level synthesis design repository
ISSS '95 Proceedings of the 8th international symposium on System synthesis
Code Generation for Embedded Processors
Code Generation for Embedded Processors
Application-Driven Architecture Synthesis
Application-Driven Architecture Synthesis
Memory Organization for Improved Data Cache Performance in Embedded Processors
ISSS '96 Proceedings of the 9th international symposium on System synthesis
Architectural exploration and optimization of local memory in embedded systems
ISSS '97 Proceedings of the 10th international symposium on System synthesis
DATE '00 Proceedings of the conference on Design, automation and test in Europe
Code placement in hardware/software co-synthesis to improve performance and reduce cost
Proceedings of the conference on Design, automation and test in Europe
Dynamic management of scratch-pad memory space
Proceedings of the 38th annual Design Automation Conference
Exploiting scratch-pad memory using Presburger formulas
Proceedings of the 14th international symposium on Systems synthesis
Exploiting shared scratch pad memory space in embedded multiprocessor systems
Proceedings of the 39th annual Design Automation Conference
Compiler-directed scratch pad memory hierarchy design and management
Proceedings of the 39th annual Design Automation Conference
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Codesign of embedded systems: status and trends
Readings in hardware/software co-design
Codesign of Embedded Systems: Status and Trends
IEEE Design & Test
ESOP '02 Proceedings of the 11th European Symposium on Programming Languages and Systems
Xtream-Fit: an energy-delay efficient data memory subsystem for embedded media processing
Proceedings of the 40th annual Design Automation Conference
ACM Transactions on Embedded Computing Systems (TECS)
Vectorizing for a SIMdD DSP architecture
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Polynomial-time algorithm for on-chip scratchpad memory partitioning
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Flexible Compiler-Managed L0 Buffers for Clustered VLIW Processors
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Data Reuse Analysis Technique for Software-Controlled Memory Hierarchies
Proceedings of the conference on Design, automation and test in Europe - Volume 1
An integrated hardware/software approach for run-time scratchpad management
Proceedings of the 41st annual Design Automation Conference
Data compression for improving SPM behavior
Proceedings of the 41st annual Design Automation Conference
On-chip Stack Based Memory Organization for Low Power Embedded Architectures
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
A post-compiler approach to scratchpad mapping of code
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Studying Storage-Recomputation Tradeoffs in Memory-Constrained Embedded Processing
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Compiler-Based Approach for Exploiting Scratch-Pad in Presence of Irregular Array Access
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Data space-oriented tiling for enhancing locality
ACM Transactions on Embedded Computing Systems (TECS)
Dataflow analysis for energy-efficient scratch-pad memory management
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Distributed Data Cache Designs for Clustered VLIW Processors
IEEE Transactions on Computers
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Instruction code mapping for performance increase and energy reduction in embedded computer systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Variable-Based Multi-module Data Caches for Clustered VLIW Processors
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Memory Coloring: A Compiler Approach for Scratchpad Memory Management
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Banked scratch-pad memory management for reducing leakage energy consumption
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Hardware/software managed scratchpad memory for embedded system
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Hierarchical memory size estimation for loop fusion and loop shifting in data-dominated applications
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
A novel instruction scratchpad memory optimization method based on concomitance metric
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Maximizing data reuse for minimizing memory space requirements and execution cycles
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Analysis of scratch-pad and data-cache performance using statistical methods
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Data partitioning for maximal scratchpad usage
ASP-DAC '03 Proceedings of the 2003 Asia and South Pacific Design Automation Conference
Shared Scratch-Pad Memory Space Management
ISQED '06 Proceedings of the 7th International Symposium on Quality Electronic Design
Improving scratch-pad memory reliability through compiler-guided data block duplication
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
Dynamic scratch-pad memory management for irregular array access patterns
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Reuse analysis of indirectly indexed arrays
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Multiprocessor system-on-chip data reuse analysis for exploring customized memory hierarchies
Proceedings of the 43rd annual Design Automation Conference
Demand paging for OneNAND™ Flash eXecute-in-place
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
A dynamic code placement technique for scratchpad memory using postpass optimization
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Scratchpad memory management for portable systems with a memory management unit
EMSOFT '06 Proceedings of the 6th ACM & IEEE International conference on Embedded software
DRDU: A data reuse analysis technique for efficient scratch-pad memory management
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Reducing off-chip memory access via stream-conscious tiling on multimedia applications
International Journal of Parallel Programming
Compiler-Directed Variable Latency Aware SPM Management to CopeWith Timing Problems
Proceedings of the International Symposium on Code Generation and Optimization
Incremental hierarchical memory size estimation for steering of loop transformations
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Reducing off-chip memory access costs using data recomputation in embedded chip multi-processors
Proceedings of the 44th annual Design Automation Conference
Software controlled memory layout reorganization for irregular array access patterns
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Dynamic scratchpad memory management for code in portable systems with an MMU
ACM Transactions on Embedded Computing Systems (TECS)
Compiling for an indirect vector register architecture
Proceedings of the 5th conference on Computing frontiers
Compiler driven data layout optimization for regular/irregular array access patterns
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
Scratchpad memory management in a multitasking environment
EMSOFT '08 Proceedings of the 8th ACM international conference on Embedded software
CUDA-Lite: Reducing GPU Programming Complexity
Languages and Compilers for Parallel Computing
SPM management using Markov chain based data access prediction
Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
HitME: low power Hit MEmory buffer for embedded systems
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Precise Management of Scratchpad Memories for Localising Array Accesses in Scientific Codes
CC '09 Proceedings of the 18th International Conference on Compiler Construction: Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009
Software transactional memory for multicore embedded systems
Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Compiler-directed scratchpad memory management via graph coloring
ACM Transactions on Architecture and Code Optimization (TACO)
Adaptive scratch pad memory management for dynamic behavior of multimedia applications
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
PDCN '08 Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks
A hardware/software framework for instruction and data scratchpad memory allocation
ACM Transactions on Architecture and Code Optimization (TACO)
Virtual registers: reducing register pressure without enlarging the register file
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Customized placement for high performance embedded processor caches
ARCS'07 Proceedings of the 20th international conference on Architecture of computing systems
A hardware architecture for dynamic performance and energy adaptation
PACS'02 Proceedings of the 2nd international conference on Power-aware computer systems
ECOOP'07 Proceedings of the 2007 conference on Object-oriented technology
Hardware/software support for adaptive work-stealing in on-chip multiprocessor
Journal of Systems Architecture: the EUROMICRO Journal
Fine-grain dynamic instruction placement for L0 scratch-pad memory
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Scratchpad memory allocation for data aggregates via interval coloring in superperfect graphs
ACM Transactions on Embedded Computing Systems (TECS)
Compiler-guided leakage optimization for banked scratch-pad memories
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Compiler-directed scratch pad memory optimization for embedded multiprocessors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2002 international symposium on low-power electronics and design (ISLPED)
Design issues in composition kernels for highly functional embedded systems
Proceedings of the 2011 ACM Symposium on Applied Computing
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Power efficient instruction caches for embedded systems
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
PATMOS'05 Proceedings of the 15th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Processor energy characterization for compiler-assisted software energy reduction
Journal of Electrical and Computer Engineering
Integrating software caches with scratch pad memory
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
FCC-SDP: a fast close-coupled shared data pool for multi-core DSPs
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
MultiMaKe: Chip-multiprocessor driven memory-aware kernel pipelining
ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
Location-aware cache management for many-core processors with deep cache hierarchy
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Optimizing Data Placement of Loops for Energy Minimization with Multiple Types of Memories
Journal of Signal Processing Systems
SPM-aware scheduling for nested loops in CMP systems
ACM SIGBED Review - Special Issue on the Work-in-Progress (WiP) session of the 33rd IEEE Real-Time Systems Symposium (RTSS'12)
Embedded RAIDs-on-chip for bus-based chip-multiprocessors
ACM Transactions on Embedded Computing Systems (TECS)
SPM-Sieve: a framework for assisting data partitioning in scratch pad memory based systems
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
Efficient utilization of on-chip memory space is extremely important in modern embedded system applications based on microprocessor cores. In addition to a data cache that interfaces with slower off-chip memory, a fast on-chip SRAM, called Scratch-Pad memory, is often used in several applications. We present a technique for efficiently exploiting on-chip Scratch-Pad memory by partitioning the application's scalar and array variables into off-chip DRAM and on-chip Scratch-Pad SRAM, with the goal of minimizing the total execution time of embedded applications. Our experiments on code kernels from typical applications show that our technique results in significant performance improvements.