Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
Strategies for cache and local memory management by global program transformation
Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
A Case for Direct-Mapped Caches
Computer
Program optimization for instruction caches
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
IEEE Transactions on Computers
Compiler blockability of numerical algorithms
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Cache write policies and performance
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Computer organization & design: the hardware/software interface
Computer organization & design: the hardware/software interface
Storage assignment to decrease code size
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Unifying data and control transformations for distributed shared-memory machines
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
1995 high level synthesis design repository
ISSS '95 Proceedings of the 8th international symposium on System synthesis
Performance estimation of embedded software with instruction cache modeling
ICCAD '95 Proceedings of the 1995 IEEE/ACM international conference on Computer-aided design
Memory bank and register allocation in software synthesis for ASIPs
ICCAD '95 Proceedings of the 1995 IEEE/ACM international conference on Computer-aided design
Hardware and software mechanisms for reducing load latency
Hardware and software mechanisms for reducing load latency
Predictability of load/store instruction latencies
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Code Generation for Embedded Processors
Code Generation for Embedded Processors
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
Effective Hardware-Based Data Prefetching for High-Performance Processors
IEEE Transactions on Computers
Memory organization for video algorithms on programmable signal processors
ICCD '95 Proceedings of the 1995 International Conference on Computer Design: VLSI in Computers and Processors
Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Optimal Code Placement of Embedded Software for Instruction Caches
EDTC '96 Proceedings of the 1996 European conference on Design and Test
Reducing Address Bus Transitions for Low Power Memory Mapping
EDTC '96 Proceedings of the 1996 European conference on Design and Test
Exploiting off-chip memory access modes in high-level synthesis
ICCAD '97 Proceedings of the 1997 IEEE/ACM international conference on Computer-aided design
Reducing cache misses using hardware and software page placement
ICS '99 Proceedings of the 13th international conference on Supercomputing
Data cache sizing for embedded processor applications
Proceedings of the conference on Design, automation and test in Europe
Data and memory optimization techniques for embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Compiler support for block buffering
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Cache-efficient memory layout of aggregate data structures
Proceedings of the 14th international symposium on Systems synthesis
Data memory design and exploration for low-power embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Storage allocation for embedded processors
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Design space optimization of embedded memory systems via data remapping
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Memory Design and Exploration for Low Power, Embedded Systems
Journal of VLSI Signal Processing Systems - Special issue on signal processing systems design and implementation
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Data remapping for design space optimization of embedded memory systems
ACM Transactions on Embedded Computing Systems (TECS)
Hardware support for real-time embedded multiprocessor system-on-a-chip memory management
Proceedings of the tenth international symposium on Hardware/software codesign
Proceedings of the 40th annual Design Automation Conference
Address Code and Arithmetic Optimizations for Embedded Systems
ASP-DAC '02 Proceedings of the 2002 Asia and South Pacific Design Automation Conference
Efficient spill code for SDRAM
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Memory access driven storage assignment for variables in embedded system design
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Instruction buffering exploration for low energy VLIWs with instruction clusters
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Mesh Partitioning Approach to Energy Efficient Data Layout
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Optimizing Graph Algorithms for Improved Cache Performance
IEEE Transactions on Parallel and Distributed Systems
Instruction code mapping for performance increase and energy reduction in embedded computer systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Improving the energy behavior of block buffering using compiler optimizations
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Dynamic memory optimization using pool allocation and prefetching
ACM SIGARCH Computer Architecture News - Special issue on the 2005 workshop on binary instrumentation and application
Instruction buffering exploration for low energy embedded processors
Journal of Embedded Computing - Low-power Embedded Systems
Cache-aware optimization of BAN applications
CODES+ISSS '08 Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis
Hi-index | 0.00 |
Code generation for embedded processors opens up the possibility for several performance optimization techniques that have been ignored by traditional compilers due to compilation time constraints. We present techniques that take into account the parameters of the data caches for organizing scalar and array variables declared in embedded code into memory, with the objective of improving data cache performance. We present techniques for clustering variables to minimize compulsory cache misses, and for solving the memory assignment problem to minimize conflict cache misses. Our experiments with benchmark code kernels from DSP and other domains on the CW4001 embedded processor from LSI Logic indicate significant improvements in data cache performance by the application of our memory organization technique.