MultiMaKe: Chip-multiprocessor driven memory-aware kernel pipelining

  • Authors:
  • Luis Angel D. Bathen; Yongjin Ahn; Sudeep Pasricha; Nikil D. Dutt

  • Affiliations:
  • University of California, Irvine, CA; University of California, Irvine, CA; Colorado State University, Fort Collins; University of California, Irvine, CA

  • Venue:
  • ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
  • Year:
  • 2013

Abstract

The increasing demand for low-power, high-performance multimedia embedded systems has motivated the need for effective solutions that satisfy application bandwidth and latency requirements under a tight power budget. As technology scales, it is imperative that applications be optimized to take full advantage of the underlying resources and meet both power and performance requirements. We propose MultiMaKe, an application mapping design flow capable of discovering and enabling parallelism opportunities via code transformations, efficiently distributing the computational load across resources, and minimizing unnecessary data transfers. Our approach decomposes the application's tasks into smaller units of computation called kernels, which are distributed and pipelined across the different processing resources. We exploit inter-kernel data reuse to minimize unnecessary data transfers between kernels, early execution edges to drive performance, and kernel pipelining to increase system throughput. Our experimental results on JPEG and JPEG2000 show up to a 97% reduction in off-chip memory accesses and up to an 80% reduction in execution time compared with standard mapping and task-level pipelining approaches.
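
As a rough illustration of why pipelining kernels across processors can raise throughput, the sketch below compares sequential and pipelined execution time for a stream of data blocks. It is not the MultiMaKe flow itself: the function names and cycle counts are hypothetical, and the model assumes one kernel per processor, fixed per-block kernel latencies, and no communication or memory-stall cost.

```python
def sequential_time(kernel_cycles, num_blocks):
    """All kernels run back to back on one processor for every block."""
    return num_blocks * sum(kernel_cycles)


def pipelined_time(kernel_cycles, num_blocks):
    """Each kernel is a pipeline stage on its own processor.

    The first block pays the full pipeline latency; every later block
    completes one bottleneck-stage interval after the previous one.
    """
    if num_blocks == 0:
        return 0
    return sum(kernel_cycles) + (num_blocks - 1) * max(kernel_cycles)


# Hypothetical per-block cycle counts for four kernels
# (illustrative values only, not measurements from the paper).
kernel_cycles = [400, 900, 300, 600]
num_blocks = 100

seq = sequential_time(kernel_cycles, num_blocks)
pipe = pipelined_time(kernel_cycles, num_blocks)
reduction = 1 - pipe / seq  # fraction of execution time saved by pipelining
```

In this simplified model the slowest kernel bounds steady-state throughput, which is one reason balancing the computational load across processors matters.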