Proceedings of the 27th annual international symposium on Computer architecture
Loop Parallelization in the Polytope Model
CONCUR '93 Proceedings of the 4th International Conference on Concurrency Theory
Automatic Parallelization in the Polytope Model
The Data Parallel Programming Model: Foundations, HPF Realization, and Scientific Applications
Estimating influence of data layout optimizations on SDRAM energy consumption
Proceedings of the 2003 international symposium on Low power electronics and design
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Lattice-Based Memory Allocation
IEEE Transactions on Computers
Predator: a predictable SDRAM memory controller
CODES+ISSS '07 Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Automatic On-chip Memory Minimization for Data Reuse
FCCM '07 Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Automatic code generation for distributed memory architectures in the polytope model
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Analytical synthesis of bandwidth-efficient SDRAM address generators
Microprocessors & Microsystems
Hi-index | 0.00 |
The efficient use of bandwidth available on an external SDRAM interface is strongly dependent on the sequence of addresses requested. On-chip memory buffers can make possible data reuse and request reordering which together ensure bandwidth on an SDRAM interface is used efficiently. This paper outlines an automated procedure for generating an application-specific memory hierarchy which exploits reuse and reordering and quantifies the impact this has on memory bandwidth over a range of representative benchmarks. Considering a range of parameterized designs, we observe up to 50x reduction in the quantity of data fetched from external memory. This, combined with reordering of the transactions, allows up to 128x reduction in the memory access time of certain memory-intensive benchmarks.