Using memory profile analysis for automatic synthesis of pointers code

Authors:
Yosi Ben-Asher;Nadav Rotem
Affiliations:
Haifa University;Haifa University
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Year:
2013

Abstract

One of the main advantages of high-level synthesis (HLS) is the ability to synthesize circuits that can access multiple memory banks in parallel. Current HLS systems synthesize parallel memory references based on explicit array declarations in the source code. We consider the need to synthesize not only array references but also memory operations targeting pointers and dynamic data structures. This paper describes Automatic Memory Partitioning, a method for automatically synthesizing general data structures (arrays and pointers) into multiple memory banks for increased parallelism and performance. We use source code instrumentation to collect memory traces in order to detect linear memory access patterns. The memory traces are used to split data structures into disjoint memory regions and determine which segments may benefit from parallel memory access. We present an algorithm for allocating memory segments into multiple memory banks. Experiments show significant improvements in performance while conserving the number of memory banks.
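
The flow the abstract outlines (trace collection, linear-pattern detection, bank allocation) can be illustrated with a small sketch. This is not the paper's algorithm: the function names `linear_stride` and `assign_banks`, the trace format of `(iteration, segment)` pairs, and the use of greedy graph coloring for bank allocation are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def linear_stride(addresses):
    """Return the constant stride of an address sequence, or None.

    Hypothetical helper: a sequence counts as "linear" here if every
    consecutive pair of addresses differs by the same non-zero amount.
    """
    if len(addresses) < 3:
        return None
    stride = addresses[1] - addresses[0]
    if stride == 0:
        return None
    for prev, cur in zip(addresses, addresses[1:]):
        if cur - prev != stride:
            return None
    return stride


def assign_banks(trace, num_banks):
    """Map memory segments to banks so co-accessed segments are separated.

    trace: list of (iteration, segment) pairs (an assumed trace format).
    Segments touched in the same iteration are treated as candidates for
    parallel access, so they conflict and should get different banks.
    Greedy coloring is a stand-in, not the allocation method of the paper.
    """
    by_iteration = defaultdict(set)
    for iteration, segment in trace:
        by_iteration[iteration].add(segment)

    # Build the conflict graph between segments.
    conflicts = defaultdict(set)
    for segments in by_iteration.values():
        for a, b in combinations(sorted(segments), 2):
            conflicts[a].add(b)
            conflicts[b].add(a)

    # Color the most constrained segments first.
    order = sorted({seg for _, seg in trace},
                   key=lambda s: (-len(conflicts[s]), s))
    bank_of = {}
    for seg in order:
        used = {bank_of[n] for n in conflicts[seg] if n in bank_of}
        free = [b for b in range(num_banks) if b not in used]
        # If no bank is free, fall back to sharing bank 0.
        bank_of[seg] = free[0] if free else 0
    return bank_of


if __name__ == "__main__":
    print(linear_stride([0, 4, 8, 12]))
    print(assign_banks([(0, "a"), (0, "b"), (1, "b"), (1, "c")], 2))
```

In this sketch, segments `a` and `c` are never accessed in the same iteration, so they can share a bank, which reflects the abstract's goal of gaining parallelism while keeping the bank count low.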