Combinatorial optimization: algorithms and complexity
A fast algorithm for particle simulations
Journal of Computational Physics
Solving problems on concurrent processors. Vol. 1: General techniques and regular problems
The art of computer programming, volume 3: (2nd ed.) sorting and searching
A 3-dimensional representation for fast rendering of complex scenes
SIGGRAPH '80 Proceedings of the 7th annual conference on Computer graphics and interactive techniques
What have we learnt from using real parallel machines to solve real problems?
C3P Proceedings of the third conference on Hypercube concurrent computers and applications - Volume 2
The design and evaluation of a shared object system for distributed memory machines
OSDI '94 Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
The gravitational N-body algorithm of Barnes and Hut [1] has been successfully implemented on a hypercube concurrent processor, and the novel approach of their sequential algorithm has proved well suited to hypercube architectures. The sequential code achieves O(N log N) running time by recursively dividing space into subcells, thereby creating a hierarchical grouping of particles. Computing interactions between these groups, rather than between individual particles, dramatically reduces both the amount of communication between processors and the number of force calculations. Parallelism is achieved through an irregular spatial grid decomposition. Since the decomposition topology is not simple, a general loosely synchronous communication routine has been developed. Operations are simplified if the conventional Gray code decomposition is modified so that the bits are taken alternately from each Cartesian dimension. A speedup of 180 has been achieved for a 500,000-particle two-dimensional calculation on 256 processors, and a speedup of 65 for a 64,000-particle three-dimensional calculation on 256 processors.
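The modified Gray code decomposition mentioned above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the paper's code: it uses a regular 2^bits by 2^bits grid of cells (the paper's decomposition is irregular), and the function names `gray`, `interleave`, `cell_to_processor`, and `hamming` are hypothetical. Each cell index is Gray coded per dimension, and the processor number is built by taking bits alternately from the x and y codes instead of concatenating them.

```python
def gray(i: int) -> int:
    """Binary-reflected Gray code of i: consecutive integers differ in one bit."""
    return i ^ (i >> 1)


def interleave(gx: int, gy: int, bits: int) -> int:
    """Take bits alternately from gx and gy (x at even positions, y at odd)."""
    out = 0
    for b in range(bits):
        out |= ((gx >> b) & 1) << (2 * b)
        out |= ((gy >> b) & 1) << (2 * b + 1)
    return out


def cell_to_processor(ix: int, iy: int, bits: int) -> int:
    """Map grid cell (ix, iy) to a hypercube node number (illustrative only)."""
    return interleave(gray(ix), gray(iy), bits)


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    bits = 4  # assumed: 16 x 16 cells on a 256-node (8-dimensional) hypercube
    n = 1 << bits
    # Every pair of grid neighbours should land on directly connected nodes,
    # i.e. node numbers that differ in exactly one bit.
    for ix in range(n):
        for iy in range(n):
            p = cell_to_processor(ix, iy, bits)
            if ix + 1 < n:
                assert hamming(p, cell_to_processor(ix + 1, iy, bits)) == 1
            if iy + 1 < n:
                assert hamming(p, cell_to_processor(ix, iy + 1, bits)) == 1
    print("all grid neighbours are hypercube neighbours")
```

Because the bits alternate between dimensions, fixing the highest 2k bits of the node number selects a square block of cells rather than a strip. That may be one reason such a layout simplifies operations on a hierarchically subdivided domain, though the abstract does not say so.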