Many current applications in battery-powered devices come from the digital signal processing, telecommunication, and multimedia domains. These applications demand high computational performance, and parallelism is often the key to meeting those demands. To exploit parallel processing units, the memory must be able to supply the data path with enough data, which calls for a memory organization that supports parallel memory accesses. This paper describes a conflict-resolving parallel data memory system for application-specific instruction-set processors. The memory structure is generic and can be reused across various application-specific designs. The proposed memory system does not employ any predefined access format signals for memory addressing. The system is attached to an application-specific instruction-set processor core, and area, power, and critical path are compared. The experiments show that significant power savings can be obtained by using the parallel memory system instead of a multi-port memory.
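
To make the idea of conflict resolution in a multi-bank memory concrete, the following is a minimal behavioral sketch, not the design described in the paper. It assumes low-order interleaving (bank = address mod number of banks) and a simple policy in which requests that collide on a bank are deferred to later cycles; the function name `schedule_accesses` and both of those choices are hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def schedule_accesses(addresses, num_banks):
    """Split one set of simultaneous accesses into conflict-free cycles.

    Assumption (not from the paper): low-order interleaving, so the bank
    index is address % num_banks. Returns a list of cycles; each cycle is
    a list of (port, address) pairs that touch distinct banks.
    """
    # Queue the requests per bank, preserving the order of the ports.
    per_bank = defaultdict(list)
    for port, addr in enumerate(addresses):
        per_bank[addr % num_banks].append((port, addr))

    # Each cycle serves at most one request per bank; requests that
    # conflict on a bank wait for a later cycle.
    depth = max((len(queue) for queue in per_bank.values()), default=0)
    cycles = []
    for i in range(depth):
        cycle = [queue[i] for _, queue in sorted(per_bank.items())
                 if i < len(queue)]
        cycles.append(cycle)
    return cycles


if __name__ == "__main__":
    # Four consecutive addresses spread over four banks.
    print(schedule_accesses([0, 1, 2, 3], num_banks=4))
    # A stride equal to the bank count sends every access to bank 0.
    print(schedule_accesses([0, 4, 8, 12], num_banks=4))
```

Under these assumptions, consecutive addresses are served in a single cycle, while a stride equal to the number of banks serializes into one access per cycle. That access-pattern dependence is the behavior a conflict-resolving memory has to handle without stalling correctness.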