Memory-intensive applications present unique challenges to an application-specific integrated circuit (ASIC) designer in terms of the choice of memory organization, memory size requirements, bandwidth, access latencies, etc. The high potential of single-chip distributed logic-memory architectures in addressing many of these issues has been recognized in general-purpose computing and, more recently, in ASIC design. The high-level synthesis (HLS) techniques presented in this paper are motivated by the fact that many memory-intensive applications exhibit irregular array data access patterns. Synthesis should therefore be capable of determining a partitioned architecture in which array data and computations may have to be heterogeneously distributed to achieve the best performance speed-up. We use a combination of clustering and min-cut style partitioning techniques to yield distributed architectures, based on simulation profiling, while considering various factors including data access, locality, balanced workloads, inter-partition communication, etc. Our experiments with several benchmark applications show that the proposed techniques yielded two-way partitioned architectures that can achieve up to 2.1× (average of 1.9×) performance speed-up over conventional HLS solutions, and up to 1.5× (average of 1.4×) performance speed-up over the best feasible homogeneous partitioning solution. At the same time, the reduction in the energy-delay product over conventional single-memory designs is up to 2.7× (average of 2.0×). Partitioning into a larger number of units makes further system performance improvement achievable, at the cost of chip area.
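To make the idea of min-cut style two-way partitioning concrete, the following is a minimal sketch and not the synthesis flow described in the paper. It assumes a tiny hypothetical graph whose nodes are arrays (`A`, `B`) and computations (`op1`, `op2`), with edge weights standing in for profiled access or communication counts. All names and numbers are invented for illustration, balance is approximated by equal node counts, and the search is exhaustive rather than heuristic.

```python
from itertools import combinations


def balanced_min_cut(nodes, edges):
    """Exhaustively find a two-way partition with equal node counts
    that minimizes the total weight of edges crossing the partition.

    nodes: list of hashable node names (arrays or computations).
    edges: dict mapping (u, v) pairs to a weight, e.g. an access
           count taken from a simulation profile (assumed here).
    """
    half = len(nodes) // 2
    best_cut, best_parts = None, None
    for chosen in combinations(nodes, half):
        left = set(chosen)
        # An edge is cut when exactly one endpoint lies in `left`.
        cut = sum(w for (u, v), w in edges.items()
                  if (u in left) != (v in left))
        if best_cut is None or cut < best_cut:
            best_cut, best_parts = cut, (left, set(nodes) - left)
    return best_cut, best_parts


# Hypothetical profile data: op1 mostly touches A, op2 mostly touches B.
nodes = ["A", "B", "op1", "op2"]
edges = {
    ("A", "op1"): 90,
    ("B", "op2"): 80,
    ("A", "op2"): 5,
    ("B", "op1"): 10,
    ("op1", "op2"): 3,
}

cut, (part0, part1) = balanced_min_cut(nodes, edges)
print(cut, sorted(part0), sorted(part1))
```

In this toy instance the cheapest balanced split places each array with the computation that accesses it most, so only the infrequent cross accesses become inter-partition communication. Exhaustive search is only practical for very small graphs; larger problems would need a heuristic.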