Novel design and verification of a 16 x 16-b self-repairable reconfigurable inner product processor
Proceedings of the 12th ACM Great Lakes symposium on VLSI
Computational RAM: Implementing Processors in Memory
IEEE Design & Test
Abacus: a 1024 processor 8 ns SIMD array
ARVLSI '95 Proceedings of the 16th Conference on Advanced Research in VLSI (ARVLSI'95)
Parallel-pipeline 8×8 forward 2-D ICT processor chip for image coding
IEEE Transactions on Signal Processing
An efficient architecture for two-dimensional discrete wavelet transform
IEEE Transactions on Circuits and Systems for Video Technology
New cost-effective VLSI implementation of a 2-D discrete cosine transform and its inverse
IEEE Transactions on Circuits and Systems for Video Technology
Synchronization mechanisms on modern multi-core architectures
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
Hi-index | 0.00 |
This paper presents the design and development of a novel, low-complexity processor-in-memory (PIM) architecture for image and video compression. By integrating a novel-processing dement with SRAM, bandwidth is improved and latency is greatly reduced. This paper also presents PIM design techniques for reduced power, area, and complexity for rapid deployment and reduced cost. A design methodology is presented and followed by an analysis of the processing element performance and capabilities. The proposed datapath solution delivers between 2 to 40 times higher performance compared to other presented solutions. The architecture executes a discrete cosine and wavelet transforms achieving up to 40% higher throughput per watt and occupying as little as 0.9% area compared to a commercial digital signal processing and other application-specified integrated circuit implementations while maintaining precision. A comprehensive comparative analysis is also provided. The proposed processor-in-memory is implemented in 1.8-V 0.18-µm CMOS technology and operates with a 300-MHz clock.