External mergesort is normally implemented so that each run is stored contiguously on disk and blocks of data are read exactly in the order they are needed during merging. We investigate two ideas for improving the performance of external mergesort: interleaved layout and a new reading strategy. Interleaved layout places blocks from different runs at consecutive disk addresses, with the aim of reducing seek overhead during merging. The new reading strategy precomputes the order in which data blocks are to be read, based on where they are located on disk and when they are needed for merging. Extra buffer space makes it possible to read blocks in an order that reduces seek overhead, rather than exactly in the order they are needed for merging. A detailed simulation model was used to compare the two layout strategies and three reading strategies. The effects of using multiple work disks were also investigated. We found that, in most cases, interleaved layout does not improve performance, but that the new reading strategy consistently performs better than double buffering and forecasting.
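The idea of trading buffer space for shorter seeks can be illustrated with a small sketch. It is not the reading strategy evaluated here, only a simplified stand-in. It assumes each run is a list of blocks of sorted keys, that block j of a run is needed once the last key of block j-1 has been consumed, that seek cost is proportional to the distance between block addresses, and that there are enough buffers to hold one chunk of `window` blocks. All function names and the `window` parameter are hypothetical.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int]  # (run index, block index within the run)
Runs = List[List[List[int]]]  # runs -> blocks -> sorted keys


def needed_order(runs: Runs) -> List[Block]:
    """Order in which the merge needs blocks.

    Block j of a run is needed once the last key of block j-1 of that run
    has been consumed; the first block of every run is needed at the start.
    """
    timed = []
    for r, blocks in enumerate(runs):
        for j in range(len(blocks)):
            trigger = float("-inf") if j == 0 else blocks[j - 1][-1]
            timed.append((trigger, r, j))
    timed.sort()
    return [(r, j) for _, r, j in timed]


def contiguous_address(runs: Runs, block: Block) -> int:
    """Disk address when runs are stored contiguously, one after another."""
    r, j = block
    return sum(len(runs[i]) for i in range(r)) + j


def windowed_read_order(order: List[Block],
                        address: Callable[[Block], int],
                        window: int) -> List[Block]:
    """Assumed heuristic: split the needed order into chunks of `window`
    blocks and read each chunk in ascending disk-address order."""
    out: List[Block] = []
    for start in range(0, len(order), window):
        chunk = order[start:start + window]
        out.extend(sorted(chunk, key=address))
    return out


def seek_distance(order: List[Block],
                  address: Callable[[Block], int]) -> int:
    """Total head movement in blocks, starting at the first block read."""
    addrs = [address(b) for b in order]
    return sum(abs(b - a) for a, b in zip(addrs, addrs[1:]))


# Two runs of three blocks each, stored contiguously.
runs = [[[1, 4], [7, 10], [13, 16]],
        [[2, 5], [8, 11], [14, 17]]]
addr = lambda b: contiguous_address(runs, b)
order = needed_order(runs)
print(seek_distance(order, addr))                               # 13
print(seek_distance(windowed_read_order(order, addr, 4), addr))  # 9
```

In this toy case, reading strictly in the needed order alternates between the two runs, while reordering within a four-block window shortens the total head movement. Chunk-wise sorting is not guaranteed to help for every input; it only shows how a precomputed read order can differ from the order in which the merge consumes blocks.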