Database system concepts
The connection machine
Mapping signal processing algorithms on parallel architectures
Journal of Parallel and Distributed Computing - Massively parallel computation
Introduction to algorithms
The network architecture of the Connection Machine CM-5 (extended abstract)
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Specification and design of embedded systems
Specification and design of embedded systems
An FPGA-based hardware accelerator for image processing
Selected papers from the Oxford 1993 international workshop on field programmable logic and applications on More FPGAs
A high-performance microarchitecture with hardware-programmable functional units
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Hitting the memory wall: implications of the obvious
ACM SIGARCH Computer Architecture News
DPGA utilization and application
Proceedings of the 1996 ACM fourth international symposium on Field-programmable gate arrays
Memory bandwidth limitations of future microprocessors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Algorithms on strings, trees, and sequences: computer science and computational biology
Algorithms on strings, trees, and sequences: computer science and computational biology
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
Optimal and near-optimal global register allocations using 0–1 integer programming
Software—Practice & Experience
MPEG Video Compression Standard
MPEG Video Compression Standard
Digital Image Processing
Parallel Computers Two: Architecture, Programming and Algorithms
Parallel Computers Two: Architecture, Programming and Algorithms
Garp: a MIPS processor with a reconfigurable coprocessor
FCCM '97 Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines
Teramac-configurable custom computing
FCCM '95 Proceedings of the IEEE Symposium on FPGA's for Custom Computing Machines
A dynamic instruction set computer
FCCM '95 Proceedings of the IEEE Symposium on FPGA's for Custom Computing Machines
Reconfigurable architectures for general-purpose computing
Reconfigurable architectures for general-purpose computing
Active disks: programming model, algorithms and evaluation
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Exploiting ILP in page-based intelligent memory
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Mapping irregular applications to DIVA, a PIM-based data-intensive architecture
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Hardware-only stream prefetching and dynamic access ordering
Proceedings of the 14th international conference on Supercomputing
An embedded DRAM architecture for large-scale spatial-lattice computations
Proceedings of the 27th annual international symposium on Computer architecture
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - System Level Design
IEEE Transactions on Computers
Automatic Code Mapping on an Intelligent Memory Architecture
IEEE Transactions on Computers
Leveraging cache coherence in active memory systems
ICS '02 Proceedings of the 16th international conference on Supercomputing
The architecture of the DIVA processing-in-memory chip
ICS '02 Proceedings of the 16th international conference on Supercomputing
Demonstrating the Scalability of a Molecular Dynamics Application on a Petaflops Computer
International Journal of Parallel Programming
A Programmable Memory Hierarchy for Prefetching Linked Data Structures
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
In-memory Parallelism for Database Workloads
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Energy/Performance Design of Memory Hierarchies for Processor-in-Memory Chips
IMS '00 Revised Papers from the Second International Workshop on Intelligent Memory Systems
Adaptively Mapping Code in an Intelligent Memory Architecture
IMS '00 Revised Papers from the Second International Workshop on Intelligent Memory Systems
SAGE: A New Analysis and Optimization System for FlexRAM Architecture
IMS '00 Revised Papers from the Second International Workshop on Intelligent Memory Systems
FlexCache: A Framework for Flexible Compiler Generated Data Caching
IMS '00 Revised Papers from the Second International Workshop on Intelligent Memory Systems
Active Memory Clusters: Efficient Multiprocessing on Commodity Clusters
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Memory System Support for Irregular Applications
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Dissecting Cyclops: a detailed analysis of a multithreaded architecture
ACM SIGARCH Computer Architecture News
High-level synthesis of distributed logic-memory architectures
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Proceedings of the 40th annual Design Automation Conference
Programming the FlexRAM parallel intelligent memory system
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
VLSI Architecture: Past, Present, and Future
ARVLSI '99 Proceedings of the 20th Anniversary Conference on Advanced Research in VLSI
Reducing Cost and Tolerating Defects in Page-based Intelligent Memory
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Cache Coherence in Intelligent Memory Systems
IEEE Transactions on Computers
Architectural Support for Uniprocessor and Multiprocessor Active Memory Systems
IEEE Transactions on Computers
TRIPS: A polymorphous architecture for exploiting ILP, TLP, and DLP
ACM Transactions on Architecture and Code Optimization (TACO)
SAGE: an automatic analyzing system for a new high-performance SoC architecture-processor-in-memory
Journal of Systems Architecture: the EUROMICRO Journal
Improving workload balance and code optimization on processor-in-memory systems
Journal of Systems and Software
Characterizing a new class of threads in scientific applications for high end supercomputers
Proceedings of the 18th annual international conference on Supercomputing
Data forwarding through in-memory precomputation threads
Proceedings of the 18th annual international conference on Supercomputing
Synthesis of Heterogeneous Distributed Architectures for Memory-Intensive Applications
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Tolerating memory latency through push prefetching for pointer-intensive applications
ACM Transactions on Architecture and Code Optimization (TACO)
A Prototype Processing-In-Memory (PIM) Chip for the Data-Intensive Architecture (DIVA) System
Journal of VLSI Signal Processing Systems
A low cost, multithreaded processing-in-memory system
WMPI '04 Proceedings of the 3rd workshop on Memory performance issues: in conjunction with the 31st international symposium on computer architecture
Memory-side prefetching for linked data structures for processor-in-memory systems
Journal of Parallel and Distributed Computing
Distributed Data Cache Designs for Clustered VLIW Processors
IEEE Transactions on Computers
Reducing Server Data Traffic Using a Hierarchical Computation Model
IEEE Transactions on Parallel and Distributed Systems
Performance characteristics of MAUI: an intelligent memory system architecture
Proceedings of the 2005 workshop on Memory system performance
High-level synthesis using computation-unit integrated memories
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Intelligent memory manager: reducing cache pollution due to memory management functions
Journal of Systems Architecture: the EUROMICRO Journal
Efficient address remapping in distributed shared-memory systems
ACM Transactions on Architecture and Code Optimization (TACO)
Impulse: Memory system support for scientific applications
Scientific Programming
Proceedings of the 21st annual international conference on Supercomputing
Scalable barrier synchronisation for large-scale shared-memory multiprocessors
International Journal of High Performance Computing and Networking
Communications of the ACM - Web science
Destructive-read in embedded DRAM, impact on power consumption
Journal of Embedded Computing - Issues in embedded single-chip multicore architectures
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Languages and Compilers for Parallel Computing
A multi-streaming SIMD architecture for multimedia applications
Proceedings of the 6th ACM conference on Computing frontiers
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Evaluation of OpenMP for the cyclops multithreaded architecture
WOMPAT'03 Proceedings of the OpenMP applications and tools 2003 international conference on OpenMP shared memory parallel programming
Toward to utilize the heterogeneous multiple processors of the chip multiprocessor architecture
EUC'07 Proceedings of the 2007 international conference on Embedded and ubiquitous computing
Self-aware memory: managing distributed memory in an autonomous multi-master environment
ARCS'08 Proceedings of the 21st international conference on Architecture of computing systems
A multi-streaming SIMD multimedia computing engine
Microprocessors & Microsystems
COSPIM: a program optimization system for tightly-coupled heterogeneous environments
ICCOMP'06 Proceedings of the 10th WSEAS international conference on Computers
A SIMD neural network processor for image processing
ISNN'05 Proceedings of the Second international conference on Advances in neural networks - Volume Part II
Cache write-back schemes for embedded destructive-read DRAM
ARCS'06 Proceedings of the 19th international conference on Architecture of Computing Systems
ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part III
An energy reduction scheduling mechanism for a high-performance soc architecture
EUC'05 Proceedings of the 2005 international conference on Embedded and Ubiquitous Computing
A resistive TCAM accelerator for data-intensive computing
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
The Journal of Supercomputing
Active disk meets flash: a case for intelligent SSDs
Proceedings of the 27th international ACM conference on International conference on supercomputing
AC-DIMM: associative computing with STT-MRAM
Proceedings of the 40th Annual International Symposium on Computer Architecture
A new perspective on processing-in-memory architecture design
Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Hi-index | 0.01 |
Microprocessors and memory systems suffer from a growing gap in performance. We introduce Active Pages, a computation model which addresses this gap by shifting data-intensive computations to the memory system. An Active Page consists of a page of data and a set of associated functions which can operate upon that data. We describe an implementation of Active Pages on RADram (Reconfigurable Architecture DRAM), a memory system based upon the integration of DRAM and reconfigurable logic. Results from the SimpleScalar simulator [BA97] demonstrate up to 1000X speedups on several applications using the RADram system versus conventional memory systems. We also explore the sensitivity of our results to implementations in other memory technologies.