Digital Computer Arithmetic
Video Processing and Communications
Video Processing and Communications
Computational RAM: Implementing Processors in Memory
IEEE Design & Test
A PIM-based Multiprocessor System
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
ASAP '02 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors
Proceedings of the 30th annual international symposium on Computer architecture
IWIA '01 Proceedings of the Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'01)
Proceedings of the 2006 international symposium on Low power electronics and design
Proceedings of the 19th ACM Great Lakes symposium on VLSI
Hierarchical reconfigurable computing arrays for efficient CGRA-based embedded systems
Proceedings of the 46th Annual Design Automation Conference
EURASIP Journal on Embedded Systems
MORA: a new coarse-grain reconfigurable array for high throughput multimedia processing
SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation
Dynamic context compression for low-power coarse-grained reconfigurable architecture
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Low power reconfiguration technique for coarse-grained reconfigurable architecture
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
Multimedia applications have become a dominant computing workload for computer systems as well as for wireless-based devices. Due to their repetitive computing and memory intensive nature, they can take effective advantage from Processor-In-Memory (PIM) technology. In this paper, a new low-power PIM-based 32-bit reconfigurable datapath optimized for multimedia applications is presented. The new circuit efficiently performs parallel arithmetic operations on either 8-, 16-, or 32-bit integer data or on 32-bit single precision floating-point data. As a result, high flexibility is provided at a very low hardware cost. When implemented using the UMC 0.18 μm 1.8 V CMOS technology, the proposed datapath exhibits a 285 MHz running frequency, dissipates just 0.12 mW/MHz and occupies a silicon area of only 107,323 μm2. When performing 2D-DCT, proposed architecture consumes 74% less power and is 28% more power efficient compared to top-of-the-line commercial TI DSP