SAGE: an automatic analyzing system for a new high-performance SoC architecture - processor-in-memory

Authors:
Slo-Li Chu;Tsung-Chuan Huang
Affiliations:
Department of Information and Computer Engineering, Chung Yuan Christian University, 22 Pu-Jen, Pu-Chung Li, Chung-Li 320, Taiwan;Department of Electrical Engineering, National Sun Yat-sen University, 70 Lien-hai Road, Kaohsiung 804, Taiwan
Venue:
Journal of Systems Architecture: the EUROMICRO Journal
Year:
2004

Abstract

Continuous improvements in semiconductor fabrication density are enabling new classes of System-on-a-Chip (SoC) architectures that combine extensive processing logic with high-density memory. Such architectures are generally called Processor-in-Memory (PIM) or Intelligent Memory (I-RAM), and they can support high-performance computing by narrowing the performance gap between the processor and the memory. A PIM architecture combines various processors in a single system, each characterized by its computation and memory-access capabilities. A strategy is therefore needed that identifies these capabilities and dispatches the most appropriate jobs to each processor so that all of them are fully exploited. Accordingly, this study presents an automatic source-to-source parallelizing system, called statement-analysis-grouping-evaluation (SAGE), to exploit the advantages of PIM architectures. Unlike conventional iteration-based parallelizing systems, SAGE adopts a statement-based analysis approach. This study addresses a PIM configuration with one host processor (i.e., the main processor in state-of-the-art computer systems) and one memory processor (i.e., the computing logic integrated with the memory). The strategy of the SAGE system, in which the original program is decomposed into blocks and a feasible execution schedule is produced for the host and memory processors, is also investigated, and experimental results for real benchmarks are discussed.
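
The abstract does not spell out SAGE's algorithms, so the following is only a minimal sketch of the general idea of statement-based scheduling on one host processor and one memory processor. It is not the authors' method. The `Statement` class, the `depends_on` and `schedule` helpers, the per-processor cost fields, and the three example statements are all hypothetical, and the sketch ignores communication cost between the two processors. Each statement is placed greedily on whichever processor would finish it earliest while respecting data dependences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    name: str
    reads: frozenset   # variables the statement reads
    writes: frozenset  # variables the statement writes
    host_cost: int     # assumed estimated cycles on the host processor
    mem_cost: int      # assumed estimated cycles on the memory processor


def depends_on(later, earlier):
    """True if `later` must wait for `earlier` (flow, anti, or output dependence)."""
    return bool(
        (earlier.writes & later.reads)      # flow dependence
        or (earlier.reads & later.writes)   # anti dependence
        or (earlier.writes & later.writes)  # output dependence
    )


def schedule(statements):
    """Greedy list scheduling in program order; returns (name, processor, start, end)."""
    free_at = {"host": 0, "mem": 0}  # time at which each processor becomes idle
    finish = {}                      # finish time of each scheduled statement
    plan = []
    for i, stmt in enumerate(statements):
        # A statement is ready once every earlier statement it depends on has finished.
        ready = max(
            (finish[p.name] for p in statements[:i] if depends_on(stmt, p)),
            default=0,
        )
        candidates = []
        for proc, cost in (("host", stmt.host_cost), ("mem", stmt.mem_cost)):
            start = max(ready, free_at[proc])
            candidates.append((start + cost, proc, start))
        end, proc, start = min(candidates)  # earliest finish time wins
        free_at[proc] = end
        finish[stmt.name] = end
        plan.append((stmt.name, proc, start, end))
    return plan


# Hypothetical three-statement program with made-up costs.
program = [
    Statement("S1", frozenset({"b", "c"}), frozenset({"a"}), host_cost=2, mem_cost=6),
    Statement("S2", frozenset({"e"}), frozenset({"d"}), host_cost=8, mem_cost=3),
    Statement("S3", frozenset({"a", "d"}), frozenset({"f"}), host_cost=2, mem_cost=5),
]
plan = schedule(program)
for entry in plan:
    print(entry)
```

In this made-up example, S1 and S2 are independent and run concurrently on different processors, while S3 waits for both. The resulting schedule finishes at time 5, compared with 12 if every statement ran on the host alone.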