CMSM: an efficient and effective code management for software managed multicores

Authors:
Ke Bai;Jing Lu;Aviral Shrivastava;Bryce Holton
Affiliations:
Arizona State University, Tempe, Arizona;Arizona State University, Tempe, Arizona;Arizona State University, Tempe, Arizona;Arizona State University, Tempe, Arizona
Venue:
Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis
Year:
2013

Citing 17
Cited 0

Reducing energy consumption by dynamic copying of instructions onto onchip memory
Proceedings of the 15th international symposium on System Synthesis

Scratchpad memory: design alternative for cache on-chip memory in embedded systems
Proceedings of the tenth international symposium on Hardware/software codesign

A study of branch prediction strategies
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture

A post-compiler approach to scratchpad mapping of code
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems

A novel instruction scratchpad memory optimization method based on concomitance metric
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference

MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop

Dynamic allocation for scratch-pad memory using compile-time decisions
ACM Transactions on Embedded Computing Systems (TECS)

Scratchpad memory management for portable systems with a memory management unit
EMSOFT '06 Proceedings of the 6th ACM & IEEE International conference on Embedded software

SDRM: simultaneous determination of regions and function-to-region mapping for scratchpad memories
HiPC'08 Proceedings of the 15th international conference on High performance computing

A performance model and code overlay generator for scratchpad enhanced embedded processors
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis

Heap data management for limited local memory (LLM) multi-core processors
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis

Overlay techniques for scratchpad memories in low power embedded processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Vector class on limited local memory (LLM) multi-core processors
CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems

Stack data management for Limited Local Memory (LLM) multi-core processors
ASAP '11 Proceedings of the ASAP 2011 - 22nd IEEE International Conference on Application-specific Systems, Architectures and Processors

Automatic code overlay generation and partially redundant code fetch elimination
ACM Transactions on Architecture and Code Optimization (TACO)

Automatic and efficient heap data management for limited local memory multicore architectures
Proceedings of the Conference on Design, Automation and Test in Europe

SSDM: smart stack data management for software managed multicores (SMMs)
Proceedings of the 50th Annual Design Automation Conference

Abstract

As the number of cores in a multicore processor grows, scaling the memory hierarchy becomes a major challenge. Software Managed Multicore (SMM) architectures are a promising solution. In an SMM architecture there are no caches, and each core has only a local scratchpad memory. If the code and data of the task mapped to a core do not all fit in its local scratchpad memory, explicit code and data management is required. In this paper, we address the problem of efficiently managing code on an SMM architecture. We extend the state of the art by: i) correctly calculating the code management overhead, even in the presence of branches in the task, and ii) developing a heuristic, CMSM (Code Mapping for Software Managed multicores), that results in efficient code management on the local scratchpad memory. Our experimental results, collected by executing applications from the MiBench suite [1] on the Cell SPEs (Cell is an SMM architecture) [2], demonstrate that correct management cost calculation and branch consideration can improve performance by 12%. Our heuristic CMSM can reduce runtime in more than 80% of the cases, and by up to 20%, on our set of benchmarks.
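
To make the notion of code management overhead concrete, the following is a minimal sketch of how the cost of a function-to-region mapping could be estimated from an execution trace. It is not the CMSM heuristic or the cost model from the paper: the function name `overlay_transfer_bytes`, the trace, the function sizes, and the two mappings are all hypothetical, and the sketch assumes that each scratchpad region holds exactly one function at a time and that the cost is simply the number of bytes copied in.

```python
def overlay_transfer_bytes(trace, region_of, size_of):
    """Count bytes copied into the scratchpad for one execution trace.

    trace     -- sequence of function names in the order they execute
    region_of -- hypothetical mapping: function name -> region id
    size_of   -- hypothetical mapping: function name -> code size in bytes
    """
    resident = {}  # region id -> function currently loaded in that region
    total = 0
    for fn in trace:
        region = region_of[fn]
        if resident.get(region) != fn:
            # The region holds another function (or nothing): copy fn in.
            total += size_of[fn]
            resident[region] = fn
    return total


# Illustrative data only: main calls f, returns, calls g, returns.
sizes = {"main": 400, "f": 300, "g": 200}
trace = ["main", "f", "main", "g", "main"]

# All functions share one region (smallest footprint, most reloads).
shared = {"main": 0, "f": 0, "g": 0}
# main gets its own region; f and g overlay each other in a second region.
split = {"main": 0, "f": 1, "g": 1}

print(overlay_transfer_bytes(trace, shared, sizes))  # 1700
print(overlay_transfer_bytes(trace, split, sizes))   # 900
```

In this toy setting the second mapping uses more scratchpad space (two regions instead of one) but avoids reloading `main` after every return, which is the kind of space versus transfer trade-off a code mapping technique has to weigh. A trace with branches would take different paths on different runs, so a realistic estimate would have to account for which calls actually execute.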