We study cache effects in distribution sorting algorithms. We note that the performance of a recently published distribution sorting algorithm, Flashsort1, which sorts n uniformly distributed floating-point values in O(n) expected time, does not scale well with the input size n because of poor cache utilisation. We present a two-pass variant of this algorithm that outperforms both the one-pass variant and comparison-based algorithms for moderate to large values of n. We also present a cache analysis of these algorithms that predicts their cache miss rates quite well. Finally, we show that the integer sorting algorithm MSB radix sort can be used very effectively on floating-point data; it is very fast owing to fast integer operations and relatively good cache utilisation.
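
The idea of applying an integer MSB radix sort to floating-point data can be illustrated with a minimal sketch. It assumes IEEE 754 double-precision inputs and is not the implementation evaluated in the paper: the names `float_key` and `msb_radix_sort`, the 8-bit digit size, and the small-bucket cutoff are all hypothetical choices made for illustration. The sketch also uses temporary bucket lists rather than an in-place permutation, so it shows the key transformation and the recursion only, not the cache behaviour discussed above.

```python
import struct


def float_key(x: float) -> int:
    """Map a 64-bit float to an unsigned 64-bit integer with the same ordering.

    Assumes IEEE 754 doubles. Negative values have all bits flipped and
    non-negative values have the sign bit set, so that comparing the
    resulting integers matches comparing the original floats.
    """
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    if bits >> 63:                      # sign bit set: negative value
        return bits ^ 0xFFFFFFFFFFFFFFFF
    return bits | (1 << 63)


def _msb_sort(items, shift, radix_bits, cutoff):
    """Recursively distribute (key, value) pairs on the digit at `shift`."""
    # Small buckets (or exhausted digits) fall back to a comparison sort.
    if len(items) <= cutoff or shift < 0:
        return sorted(items, key=lambda kv: kv[0])
    mask = (1 << radix_bits) - 1
    buckets = [[] for _ in range(1 << radix_bits)]
    for item in items:
        buckets[(item[0] >> shift) & mask].append(item)
    out = []
    for bucket in buckets:
        if bucket:
            out.extend(_msb_sort(bucket, shift - radix_bits, radix_bits, cutoff))
    return out


def msb_radix_sort(values, radix_bits=8, cutoff=32):
    """Sort floats by running an MSB radix sort on their integer keys."""
    keyed = [(float_key(v), v) for v in values]
    ordered = _msb_sort(keyed, 64 - radix_bits, radix_bits, cutoff)
    return [v for _, v in ordered]
```

Calling `msb_radix_sort(data)` on a list of finite floats returns a new sorted list. Because every step after the key transformation works on integers only, the inner loop consists of shifts and masks, which is one plausible reading of the "fast integer operations" mentioned in the abstract.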