Ongoing changes in computer architecture are affecting the efficiency of string-sorting algorithms. The size of main memory in typical computers continues to grow, but each memory access costs an increasing number of instruction cycles. This is a problem for the most efficient existing string-sorting algorithms, which do not use cache well on large data sets. We propose a new sorting algorithm for strings, burstsort, based on the dynamic construction of a compact trie in which strings are kept in buckets. It is simple, fast, and efficient. We experimentally explore key implementation options and compare burstsort to existing string-sorting algorithms on large and small sets of strings with a range of characteristics. These experiments show that, for large sets of strings, burstsort is almost twice as fast as any previous algorithm, primarily because of its lower cache-miss rate.
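
The idea of a dynamically built trie whose leaves are buckets of strings can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Node`, `burstsort`, and `threshold`, the tiny default threshold, the dictionary-based child map, and the use of Python's built-in sort on each bucket are all assumptions made for illustration. A bucket is "burst" into a new trie node once it grows past the threshold, and an in-order traversal then yields the sorted output.

```python
class Node:
    """Trie node (hypothetical layout): children map one character to
    either another Node or a bucket, which is a plain list of strings."""
    def __init__(self):
        self.children = {}
        self.exhausted = []  # strings that end exactly at this node


def _insert(node, s, depth, threshold):
    # Walk down the trie, consuming one character of s per level.
    while True:
        if depth == len(s):
            node.exhausted.append(s)
            return
        c = s[depth]
        child = node.children.get(c)
        if child is None:
            node.children[c] = [s]          # start a new bucket
            return
        if isinstance(child, Node):
            node, depth = child, depth + 1  # descend into the trie
            continue
        child.append(s)                     # child is a bucket
        if len(child) > threshold:
            # Burst: replace the bucket with a node and redistribute
            # its strings one character deeper.
            new_node = Node()
            node.children[c] = new_node
            for t in child:
                _insert(new_node, t, depth + 1, threshold)
        return


def _collect(node, depth, out):
    # Strings ending here precede every longer string with this prefix.
    out.extend(node.exhausted)
    for c in sorted(node.children):
        child = node.children[c]
        if isinstance(child, Node):
            _collect(child, depth + 1, out)
        else:
            # Bucket strings share their first depth + 1 characters, so
            # only the suffixes need comparing. The built-in sort is a
            # stand-in for whatever bucket sort a real version would use.
            child.sort(key=lambda t: t[depth + 1:])
            out.extend(child)


def burstsort(strings, threshold=4):
    """Return the strings in sorted order (threshold is an assumed,
    deliberately small value so that bursts occur on tiny inputs)."""
    root = Node()
    for s in strings:
        _insert(root, s, 0, threshold)
    out = []
    _collect(root, 0, out)
    return out


words = ["banana", "band", "ban", "apple", "bandit", "bank", "", "apple"]
print(burstsort(words, threshold=2))
```

Each bucket holds only strings that share a prefix, so it can stay small enough to be sorted while resident in cache, which matches the cache-miss argument made in the abstract. A realistic threshold would be far larger than the one used here.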