Distributed loop buffer architectures with incompatible loop-nest organisations allow incompatible loops to execute in parallel with minimal hardware overhead. These distributed, scalable architectures are therefore a promising option for improving the energy efficiency of the instruction memory organisations in embedded systems. This paper proposes and analyses non-overlapping, complementary implementation options for distinct partitions of the design space of distributed loop buffer architectures. A high-level trade-off analysis of the proposed implementations identifies the design process that an embedded systems designer should follow to obtain an efficient distributed loop buffer architecture for a given application. Results show that, at the cost of an increase of about 6.5% in the energy consumption of the control logic in the instruction memory organisation, the overall energy consumption of the instruction memory organisation can be reduced by 6% to 22% when distributed loop buffer architectures with incompatible loop-nest organisations are used instead of clustered loop buffer architectures with shared loop-nest organisations.