Multiprocessor system-on-chip (MP-SoC) platforms represent an emerging trend for embedded multimedia applications. To enable MP-SoC platforms, scalable communication-centric interconnect fabrics, such as networks-on-chip (NoCs), have recently been proposed. Shared memory is one of the key elements in designing MP-SoCs, providing data exchange and synchronization support. This paper focuses on the energy/delay exploration of a distributed shared memory architecture suitable for low-power on-chip multiprocessors based on NoCs. A mechanism is proposed for data allocation in the distributed shared memory space, dynamically managed by an on-chip hardware memory management unit (HwMMU). Moreover, the use of the HwMMU primitives for the migration, replication, and compaction of shared data is discussed. Experimental results show the impact of different distributed shared memory configurations on a selected set of parallel benchmark applications from the power/performance perspective. Furthermore, a case study of a graph exploration algorithm is discussed, accounting for the effects of core mapping and network topology on energy and performance at the system level.