External mergesort begins with a run formation phase that creates the initial sorted runs. Run formation can be done by a load-sort-store algorithm or by replacement selection. A load-sort-store algorithm repeatedly fills available memory with input records, sorts them, and writes the result to a run file. Replacement selection produces longer runs than load-sort-store algorithms and completely overlaps sorting and I/O, but it has poor locality of reference, which results in frequent cache misses, and the classical algorithm works only for fixed-length records. This paper introduces batched replacement selection: a cache-conscious version of replacement selection that also works for variable-length records. The new algorithm resembles AlphaSort in that it creates small in-memory runs and merges them to form the output runs. Its performance is compared experimentally with three other run formation algorithms: classical replacement selection, Quicksort, and AlphaSort. The experiments show that batched replacement selection is considerably faster than classical replacement selection. For small records (average 100 bytes), CPU time was reduced by about 50 percent and elapsed time by 47-63 percent. It was also consistently faster than Quicksort, but it did not always outperform AlphaSort. Replacement selection produces fewer runs than Quicksort and AlphaSort. The experiments confirmed that this reduces the merge time, whereas the effect on the overall sort time depends on the number of disks available.
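To make the two baseline run formation strategies concrete, the following is a minimal in-memory sketch of load-sort-store and of classical replacement selection. It is not the batched algorithm introduced in the paper, whose details the abstract does not give. The function names, the `capacity` parameter (memory size counted in records), and the use of bare comparable keys in place of fixed-length records are all assumptions made for illustration. Runs are returned as lists rather than written to run files.

```python
import heapq
from itertools import islice

_SENTINEL = object()  # marks end of input, so None remains a legal record


def load_sort_store(records, capacity):
    """Yield runs by filling memory with `capacity` records and sorting them."""
    it = iter(records)
    while True:
        chunk = sorted(islice(it, capacity))  # load, then sort
        if not chunk:
            return
        yield chunk  # "store": a real system would write a run file here


def replacement_selection(records, capacity):
    """Yield runs using a heap that holds at most `capacity` records.

    Each heap entry is (run_number, key), so every record of the current
    run is output before any record deferred to the next run.
    """
    it = iter(records)
    heap = [(0, key) for key in islice(it, capacity)]
    heapq.heapify(heap)
    current_run, run = 0, []
    while heap:
        run_no, key = heapq.heappop(heap)
        if run_no != current_run:
            # The current run is exhausted; start the next one.
            yield run
            run, current_run = [], run_no
        run.append(key)
        nxt = next(it, _SENTINEL)
        if nxt is not _SENTINEL:
            # A key smaller than the one just output cannot extend the
            # current run, so it is tagged for the following run.
            target = run_no if nxt >= key else run_no + 1
            heapq.heappush(heap, (target, nxt))
    if run:
        yield run


if __name__ == "__main__":
    data = [5, 1, 9, 3, 7, 2, 8, 4, 6, 0]
    print(list(load_sort_store(data, 3)))
    print(list(replacement_selection(data, 3)))
```

On this ten-record input with room for three records, load-sort-store yields four runs while replacement selection yields three, the first of which is twice the memory size. The heap accesses that produce the longer runs are also the source of the poor locality of reference mentioned above.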