In this paper we describe efficient data-fetch circuitry for retrieving several operands from an n-bank interleaved memory system in a single machine cycle. The proposed address generation (AGEN) unit operates with a modified version of the low-order-interleaved memory access approach. Our design supports data structures of arbitrary length and with different (odd) strides. We present a detailed discussion of the 32-bit AGEN design, which is aimed at multiple-operand functional units. The experimental results indicate that our AGEN can produce 8 × 32-bit addresses every 6 ns for different stride cases when implemented on a VIRTEX-II PRO xc2vp30-7ff1696 FPGA device, using trivial hardware resources.
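
To illustrate why odd strides suit an interleaved memory, the following minimal Python sketch models plain low-order interleaving, not the modified scheme or the AGEN hardware described in the paper. It assumes the bank count is a power of two (8 here, matching the 8 addresses per cycle quoted above) and that addresses are word addresses. The function names `bank_and_row`, `fetch_plan`, and `is_conflict_free` are hypothetical and chosen only for this example.

```python
def bank_and_row(address, n_banks=8):
    """Plain low-order interleaving (an assumption, not the paper's modified scheme):
    the low-order part of the address selects the bank, the rest selects the row."""
    return address % n_banks, address // n_banks


def fetch_plan(base, stride, n_banks=8):
    """Map n_banks consecutive vector elements to (bank, row) pairs."""
    addresses = [base + i * stride for i in range(n_banks)]
    return [bank_and_row(a, n_banks) for a in addresses]


def is_conflict_free(base, stride, n_banks=8):
    """True when every one of the n_banks elements lands in a different bank,
    so all of them could in principle be fetched in the same cycle."""
    banks = [bank for bank, _ in fetch_plan(base, stride, n_banks)]
    return len(set(banks)) == n_banks
```

With 8 banks, base 0, and stride 3, the elements fall into banks 0, 3, 6, 1, 4, 7, 2, 5, one per bank. An even stride such as 2 revisits banks within the same group of 8 elements, which is consistent with the restriction to odd strides stated above.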