Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Instruction fetch mechanisms for VLIW architectures with compressed encodings
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Generation of Efficient Nested Loops from Polyhedra
International Journal of Parallel Programming - Special issue on instruction-level parallelism and parallelizing compilation, part 2
Comparing power consumption of an SMT and a CMP DSP for mobile phone workloads
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Enhancing loop buffering of media and telecommunications applications using low-overhead predication
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Weld: A Multithreading Technique Towards Latency-Tolerant VLIW Processors
HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
Synthesis of customized loop caches for core-based embedded systems
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Scratchpad memory: design alternative for cache on-chip memory in embedded systems
Proceedings of the tenth international symposium on Hardware/software codesign
An Efficient Compiler Technique for Code Size Reduction Using Reduced Bit-Width ISAs
Proceedings of the conference on Design, automation and test in Europe
Assigning Program and Data Objects to Scratchpad for Energy Reduction
Proceedings of the conference on Design, automation and test in Europe
Optimizing the Memory Bandwidth with Loop Morphing
ASAP '04 Proceedings of the Application-Specific Systems, Architectures and Processors, 15th IEEE International Conference
Clustered Loop Buffer Organization for Low Energy VLIW Embedded Processors
IEEE Transactions on Computers
Power Breakdown Analysis for a Heterogeneous NoC Platform Running a Video Application
ASAP '05 Proceedings of the 2005 IEEE International Conference on Application-Specific Systems, Architecture Processors
Partitioning Multi-Threaded Processors with a Large Number of Threads
ISPASS '05 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005
Compiler-directed scratch pad memory optimization for embedded multiprocessors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2002 international symposium on low-power electronics and design (ISLPED)
Systematic intermediate sequence removal for reduced memory accesses
SCOPES '07 Proceedingsof the 10th international workshop on Software & compilers for embedded systems
Reducing complexity of multiobjective design space exploration in VLIW-based embedded systems
ACM Transactions on Architecture and Code Optimization (TACO)
Playing the trade-off game: Architecture exploration using Coffeee
ACM Transactions on Design Automation of Electronic Systems (TODAES)
COFFEE: compiler framework for energy-aware exploration
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Survey of Low-Energy Techniques for Instruction Memory Organisations in Embedded Systems
Journal of Signal Processing Systems
Journal of Signal Processing Systems
Hi-index | 0.00 |
Reduced energy consumption is one of the most important design goals for embedded application domains like wireless, multimedia and biomedical. Instruction memory hierarchy has been proven to be one of the most power hungry parts of the system. This paper introduces an architectural enhancement for the instruction memory to reduce energy and improve performance. The proposed distributed instruction memory organization requires minimal hardware overhead and allows execution of multiple loops in parallel in a uni-processor system. This architecture enhancement can reduce the energy consumed in the instruction and data memory hierarchy by 70.01% and improve the performance by 32.89% compared to enhanced SMT based architectures.