Major advances in Merged Logic DRAM (MLD) technology, coupled with the popularization of memory-intensive applications, provide fertile ground for architectures based on Intelligent Memory (IRAM) or Processors-in-Memory (PIM). The contribution of this paper is to explore one way to use the current state-of-the-art MLD technology for general-purpose computers. To satisfy the requirements of general-purpose use and low programming cost, we place the PIM chips in the memory system and let them default to plain DRAM if the application is not enabled for intelligent memory. Since wide usability is crucial, we identify and analyze a range of real applications for PIM. Based on the requirements of these applications and current technological constraints, we design a PIM chip and a PIM-based memory system. We call the chip FlexRAM. We describe FlexRAM's design and floorplan, and the resulting memory system. Evaluation of the system through simulations shows that 4 FlexRAM chips often allow a workstation to run 25-40 times faster.