We describe two improvements to a previous implementation of out-of-core columnsort, in which the data reside on multiple disks. The first improvement replaces asynchronous I/O and communication calls with synchronous calls within a threaded framework. Experimental runs show that this change reduces the running time to approximately half that of the previous implementation. The second improvement uses algorithmic and engineering techniques to reduce the number of passes over the data from four to three. Experimental evidence shows that it yields modest performance gains. We expect the gain from this second improvement to grow as processing and communication become faster relative to disk I/O.
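The idea behind the first improvement, synchronous calls inside a threaded framework, can be illustrated with a minimal sketch. It is not the authors' implementation: the names `stage` and `run_pipeline`, the three-stage read/sort/write layout, and the in-memory lists standing in for disk buffers are all assumptions made for illustration. Each stage runs in its own thread and uses only plain blocking calls, so the overlap between stages comes from the threads rather than from asynchronous I/O calls.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


def stage(work, inbox, outbox):
    """Apply `work` to each item from `inbox` using only blocking calls."""
    while True:
        item = inbox.get()  # blocking (synchronous) receive
        if item is SENTINEL:
            if outbox is not None:
                outbox.put(SENTINEL)  # pass the end marker downstream
            return
        result = work(item)
        if outbox is not None:
            outbox.put(result)  # blocks while the next stage is busy


def run_pipeline(buffers):
    """Push buffers through read -> sort -> write, one thread per stage."""
    # Small bounded queues limit how many buffers are in flight at once.
    q_read = queue.Queue(maxsize=2)
    q_sort = queue.Queue(maxsize=2)
    q_write = queue.Queue(maxsize=2)
    written = []

    def read(buf):
        return list(buf)  # assumption: stands in for a blocking disk read

    def sort(buf):
        return sorted(buf)  # in-memory sort of one buffer

    def write(buf):
        written.append(buf)  # assumption: stands in for a blocking disk write

    threads = [
        threading.Thread(target=stage, args=(read, q_read, q_sort)),
        threading.Thread(target=stage, args=(sort, q_sort, q_write)),
        threading.Thread(target=stage, args=(write, q_write, None)),
    ]
    for t in threads:
        t.start()
    for buf in buffers:
        q_read.put(buf)
    q_read.put(SENTINEL)
    for t in threads:
        t.join()
    return written
```

Calling `run_pipeline([[3, 1, 2], [9, 7, 8]])` sorts each buffer independently while the three stages work on different buffers concurrently. The sketch sorts buffers only; it does not perform the columnsort steps or any interprocessor communication.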