Function inlining and loop unrolling for loop acceleration in reconfigurable processors

  • Authors:
  • Narasinga Rao Miniskar;Pankaj Shailendra Gode;Soma Kohli;Donghoon Yoo

  • Affiliations:
  • Samsung India Software Operations Pvt. Ltd, Bengaluru, India;Samsung India Software Operations Pvt. Ltd, Bengaluru, India;Samsung India Software Operations Pvt. Ltd, Bengaluru, India;Samsung Electronics, Giheung, Maryland, South Korea

  • Venue:
  • Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

The next generation SoCs for consumer electronics need software solutions for faster time-to-market, lower development cost and higher performance while maintaining lower energy consumption and area. As a result, reconfigurable processors (RPs) have become increasingly important, which enables just enough exibility of accepting software solutions and providing application-specific hardware reconfigurability. Samsung Electronics has developed a reconfigurable processor called Samsung Reconfigurable Processor (SRP), which is the basis of our work. Though, the SRP is a powerful processor, it requires a smart and intelligent compiler to compile the application software while exploring its reconfigurable architecture. The existing compiler for the SRP does not support functional inlining and loop unrolling, and no study has yet been done on these optimizations for the RPs. In this paper, we study the impact of these optimizations on the performance of applications for the SRP processor and we also show how these optimizations are supported in the SRP compiler. We analyze the performance improvement due to these optimizations on various benchmarks namely Sobel Edge filter, JPEG decoder, and Luma Deblocking filter of the H.264 standard. Our experimental results have shown about 83% gain on performance with the functional inlining optimization and the loop unrolling optimization when compared to the original code for Sobel filter and JPEG encoder, and 11% gain on performance for Luma Deblock filter.