Memory partitioning and scheduling co-optimization in behavioral synthesis

  • Authors:
  • Peng Li; Yuxin Wang; Peng Zhang; Guojie Luo; Tao Wang; Jason Cong

  • Affiliations:
  • Peking University, Beijing, China; Peking University, Beijing, China; University of California, Los Angeles, CA; Peking University, Beijing, China; Peking University, Beijing, China and UCLA/PKU Joint Research Institute in Science and Engineering; Peking University, Beijing, China and University of California, Los Angeles, CA and UCLA/PKU Joint Research Institute in Science and Engineering

  • Venue:
  • Proceedings of the International Conference on Computer-Aided Design
  • Year:
  • 2012

Abstract

Extracting parallelism to maximize throughput in behavioral synthesis often exacerbates memory bottlenecks. Data partitioning is an important technique for increasing memory bandwidth: it lets multiple simultaneous memory accesses be scheduled to different memory banks. In this paper we present a vertical memory partitioning and scheduling algorithm that can generate a valid partition scheme for arbitrary affine memory inputs. It does this by arranging non-conflicting memory accesses across loop iteration boundaries. A mixed memory partitioning and scheduling algorithm is also proposed to combine the advantages of the vertical algorithm and other state-of-the-art algorithms. A set of theorems is provided as criteria for selecting a valid partitioning scheme, followed by an optimal and scalable memory scheduling algorithm. By exploiting the constant strides between memory addresses in successive loop iterations, an address translation optimization technique for an arbitrary partition factor is proposed to improve performance, area and energy efficiency. Experimental results show that, on a set of real-world medical image processing kernels, the proposed mixed algorithm with address translation optimization achieves a speed-up, area reduction and power savings of 15.8%, 36% and 32.4%, respectively, compared to the state-of-the-art memory partitioning algorithm.
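
To make two ideas from the abstract concrete, the following Python sketch may help. It is an illustration only and is not the algorithm from the paper. It assumes a simple cyclic partitioning (element `addr` is stored in bank `addr % n_banks` at in-bank offset `addr // n_banks`), accesses of the form `a[i + c]` in iteration `i`, and non-negative addresses and strides. All function names are hypothetical. The first helper checks whether a partition factor keeps such accesses in distinct banks; the second shows how a constant stride could let bank and offset be updated with additions and one comparison instead of a division and modulo per access.

```python
def bank_and_offset(addr, n_banks):
    """Cyclic partitioning (an assumption of this sketch):
    element addr is stored in bank addr % n_banks at offset addr // n_banks."""
    return addr % n_banks, addr // n_banks


def is_conflict_free(const_offsets, n_banks):
    """Return True if the accesses a[i + c], one per c in const_offsets,
    fall into pairwise distinct banks in every iteration i.

    (i + c1) and (i + c2) share a bank exactly when c1 and c2 are
    congruent modulo n_banks, so the check does not depend on i.
    """
    banks = {c % n_banks for c in const_offsets}
    return len(banks) == len(const_offsets)


def strided_translation(start, stride, n_banks, count):
    """Yield (bank, offset) for start, start + stride, start + 2*stride, ...

    Division and modulo are done once for the start address and once for
    the stride; every later step uses additions and a single wrap check.
    Assumes start >= 0 and stride >= 0.
    """
    bank, offset = bank_and_offset(start, n_banks)
    step_bank, step_offset = bank_and_offset(stride, n_banks)
    for _ in range(count):
        yield bank, offset
        bank += step_bank
        offset += step_offset
        if bank >= n_banks:  # carry from the bank index into the offset
            bank -= n_banks
            offset += 1
```

For example, under these assumptions the accesses `a[i]`, `a[i + 2]` and `a[i + 4]` conflict with two banks but not with three. A partition factor that is not a power of two is the case where a per-access division and modulo would be costly in hardware, which is the kind of cost an incremental update like `strided_translation` avoids.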