Wireless networks are becoming increasingly ubiquitous. Typically, several complex multi-threaded applications are mapped onto a single embedded system, and each is triggered by a different input stream, depending on the run-time behaviour of the user and the environment. This dynamism makes it very difficult, if not impossible, to analyze these systems fully at design time, so run-time information has to be used to produce an efficient design. This introduces new challenges, especially for embedded system designers using a Direct Memory Access (DMA) module, who have to know the memory transfer behaviour of the whole system in advance in order to design and program their DMA efficiently. This is especially important in embedded systems with DRAM memories, as concurrent accesses from different processing elements can interact poorly with the page-based architecture of these memories. Moreover, the increasingly common use of dynamic data types further complicates the problem, because the exact location of data instances in memory is unknown at design time. In this paper we propose a system-level optimization methodology that adapts the DMA usage parameters automatically at run time, according to online information. With the proposed approach we reduce the mean latency of memory transfers by more than 18%, thus reducing the average number of cycles that processing elements or DMAs waste waiting for data from main memory, while optimizing energy consumption and system responsiveness. We evaluate our approach using a set of real-life applications and real wireless dynamic streams.