Processor performance has improved steadily through higher clock rates and techniques for extracting instruction-level parallelism (ILP). Main memory performance, however, has not improved at a comparable rate, and the gap between processor and memory performance is expected to keep widening. This is a serious problem in high performance computing, where effective performance is limited by memory performance in most cases. To overcome this problem, we propose a new VLSI architecture called SCIMA, which integrates software-controllable memory into the processor chip. Most data accesses in high performance computing are regular, and software-controllable memory can exploit this regularity better than a conventional cache. This paper presents the SCIMA architecture and its performance evaluation. The evaluation results show the superiority of SCIMA over a conventional cache-based architecture.
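The idea of exploiting regular access with software-controlled memory can be illustrated with a small sketch. The C code below is not the SCIMA interface, which the abstract does not describe. It is an assumption-laden emulation in which an ordinary array, `onchip`, stands in for on-chip memory, and the names `ONCHIP_ELEMS`, `stage_strided`, and `sum_column_staged` are hypothetical. The program knows in advance that it will walk one column of a row-major matrix, so it gathers exactly those strided elements into the buffer and then computes on the buffer, instead of relying on a cache to fetch whole lines of which only one element is used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity of the software-managed buffer, in elements. */
#define ONCHIP_ELEMS 256

/* Stand-in for on-chip memory (assumption: real hardware would expose a
   dedicated address region rather than an ordinary static array). */
static double onchip[ONCHIP_ELEMS];

/* Explicitly gather `count` elements spaced `stride` apart into the buffer.
   Only the elements that will be used are transferred. */
static void stage_strided(const double *src, size_t stride, size_t count)
{
    assert(count <= ONCHIP_ELEMS);
    for (size_t i = 0; i < count; i++)
        onchip[i] = src[i * stride];
}

/* Sum column `col` of a row-major rows x cols matrix, staging the column
   through the buffer one chunk at a time. */
double sum_column_staged(const double *a, size_t rows, size_t cols, size_t col)
{
    double sum = 0.0;
    for (size_t r0 = 0; r0 < rows; r0 += ONCHIP_ELEMS) {
        size_t remaining = rows - r0;
        size_t n = remaining < ONCHIP_ELEMS ? remaining : ONCHIP_ELEMS;
        /* The access pattern is known in advance, so the transfer is
           issued by software rather than triggered by cache misses. */
        stage_strided(a + r0 * cols + col, cols, n);
        for (size_t i = 0; i < n; i++)
            sum += onchip[i];
    }
    return sum;
}
```

In this emulation the staging loop is an ordinary copy, so it shows only the programming model and not any performance benefit. The benefit claimed in the abstract depends on hardware support that this sketch does not model.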