One of the primary challenges in embedded system design is designing the memory hierarchy and restructuring the application to take advantage of it. This task is particularly important for embedded image and video processing applications, which make heavy use of large multi-dimensional arrays of signals and nested loops. In this paper, we show that a simple reuse vector/matrix abstraction can provide the compiler with useful information in a concise form. Using this information, the compiler can either adapt the application to an existing memory hierarchy or design a memory hierarchy for it. Our initial results indicate that the compiler is very successful both in optimizing code for a given memory hierarchy and in designing a hierarchy with a reasonable performance/size ratio.
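To make the reuse vector idea concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It assumes the common textbook definition of temporal reuse: for an affine array reference with access matrix H, two iterations touch the same element when their difference r satisfies H·r = 0. The function name `temporal_reuse_vectors`, the `bound` parameter, and the matrix-multiply loop nest used as input are all illustrative assumptions.

```python
from itertools import product

def temporal_reuse_vectors(access_matrix, bound=1):
    """Enumerate small temporal reuse vectors for one affine array reference.

    access_matrix: list of rows, one per array dimension, one column per loop.
    bound: hypothetical search limit; entries of r are drawn from [-bound, bound].
    Returns lexicographically positive integer vectors r with H * r == 0.
    """
    depth = len(access_matrix[0])
    vectors = []
    for r in product(range(-bound, bound + 1), repeat=depth):
        if not any(r):
            continue  # skip the zero vector: it relates an iteration to itself
        first_nonzero = next(x for x in r if x != 0)
        if first_nonzero < 0:
            continue  # keep only vectors that point forward in execution order
        if all(sum(h * x for h, x in zip(row, r)) == 0 for row in access_matrix):
            vectors.append(r)
    return vectors

# Assumed example: C[i][j] += A[i][k] * B[k][j] in a loop nest ordered (i, j, k).
references = {
    "A[i][k]": [[1, 0, 0], [0, 0, 1]],
    "B[k][j]": [[0, 0, 1], [0, 1, 0]],
    "C[i][j]": [[1, 0, 0], [0, 1, 0]],
}

for name, H in references.items():
    print(name, temporal_reuse_vectors(H))
```

Each returned vector names the loop that carries reuse for that reference. Under these assumptions, A[i][k] is reused across iterations of the j loop, so a compiler could keep the relevant row of A in a small on-chip buffer for the duration of that loop. Stacking the vectors of all references gives a matrix of the kind the abstract calls a reuse matrix.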