SCIMA: A Novel Architecture for High Performance Computing

Authors:
Hiroshi Nakamura;Hideki Okawara;Shuichi Sakai;Taisuke Boku;Masaaki Kondo
Venue:
IWIA '99 Proceedings of the 1999 International Workshop on Innovative Architecture
Year:
1999

Abstract

Technological trends have produced a growing disparity between processor and memory speeds. This memory wall problem is becoming especially serious in high performance computing. In this paper, we propose a new architecture, SCIMA, to address this problem. In SCIMA, addressable memory is integrated into the processor chip alongside an ordinary cache. Because this on-chip memory is software controllable, it can exploit data locality better than a data cache, which is controlled by hardware. The purpose of the on-chip memory is to reduce off-chip memory traffic by exploiting data reusability as much as possible within the chip. We have evaluated SCIMA using a QCD simulation, a practical application in quantum field theory. The performance evaluation shows that SCIMA successfully reduces off-chip memory traffic and achieves higher performance than a cache-only processor.
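
The idea of cutting off-chip traffic by staging reusable data in software-controlled on-chip memory can be illustrated with a toy model. The sketch below is not SCIMA itself and is not the QCD workload from the evaluation: it assumes a small dense matrix multiplication, models off-chip memory as plain Python lists, and models the on-chip memory as tile buffers that the program fills explicitly. The function names `matmul_direct` and `matmul_staged` and the word-counting scheme are hypothetical, introduced only for illustration.

```python
# Toy model (an assumption for illustration, not the SCIMA design):
# count words moved between "off-chip" arrays and software-managed
# "on-chip" tile buffers during an n x n matrix multiplication.

def matmul_direct(a, b):
    """Fetch every operand from off-chip memory at each use."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    traffic = 0
    for i in range(n):
        for j in range(n):
            acc = 0
            for k in range(n):
                acc += a[i][k] * b[k][j]
                traffic += 2          # one word of a, one word of b
            c[i][j] = acc
            traffic += 1              # write the result back off-chip
    return c, traffic


def matmul_staged(a, b, t):
    """Explicitly copy t x t tiles on chip, then reuse them there."""
    n = len(a)
    assert n % t == 0, "sketch assumes the tile size divides n"
    c = [[0] * n for _ in range(n)]
    traffic = 0
    for ii in range(0, n, t):
        for jj in range(0, n, t):
            c_tile = [[0] * t for _ in range(t)]   # stays on chip
            for kk in range(0, n, t):
                # Software-issued transfers: off-chip -> on-chip.
                a_tile = [row[kk:kk + t] for row in a[ii:ii + t]]
                b_tile = [row[jj:jj + t] for row in b[kk:kk + t]]
                traffic += 2 * t * t
                # All reuse below touches only on-chip data.
                for i in range(t):
                    for j in range(t):
                        for k in range(t):
                            c_tile[i][j] += a_tile[i][k] * b_tile[k][j]
            # Software-issued transfer: on-chip -> off-chip.
            for i in range(t):
                c[ii + i][jj:jj + t] = c_tile[i]
            traffic += t * t
    return c, traffic
```

Under this model, an 8 x 8 multiplication moves 1088 words in the direct version and 320 words with 4 x 4 tiles, since staged traffic scales as 2n^3/t + n^2 rather than 2n^3 + n^2. The numbers describe only this toy model and say nothing about the results reported for SCIMA.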