Introduction to Acceleration for MPI Derived Datatypes Using an Enhancer of Memory and Network
Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
The Journal of Supercomputing
A memory accelerator with gather functions for bandwidth-bound irregular applications
Proceedings of the first workshop on Irregular applications: architectures and algorithm
Hi-index | 0.00 |
This paper presents how to make inexpensive personal supercomputers getting the merit of commercial-off-the-shelf (COTS) continuously after the death of vector super-computer venders. It is designed to realize this goal without any modification on CPU, bridge chips on motherboard and memory chips. Only plugging a new memory module with vector load / store function make an inexpensive home-use personal computer into a node similar to Earth simulator's one. These nodes can be connected by COTS Infiniband 4X type or 12X type switches in order to make parallel systems. COTS SO-DIMMs on the memory modules can be accessed fastly by remote nodes by using AOTF, BOTF, RDMA and remote vector load / store operations. Applications with unit striding or indexed accesses are going to be accelerated. How to accelerate NAS CG class B is shown as an example. Used evaluation methodlogy is about 500 times faster than that of SimpleScalar based methodlogy. It is predicted with bandwidth analysis that up to 8.75 times improvement can be achieved by proposed system for a single CPU Pentium4 PC without parallel processing.