In this paper, we study the problem of sorting a large collection of strings in external memory. Based on the adaptive construction of a summary data structure called the adaptive synopsis trie, we present a practical string sorting algorithm, DistStrSort, which is suitable for sorting large string collections in external memory and for more complex string processing problems in text and semi-structured databases, such as counting, aggregation, and statistics. Case analyses of the algorithm and experiments on real datasets show the efficiency of our algorithm in realistic settings.
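
The abstract gives no algorithmic detail, so the following is only a minimal in-memory sketch of the general idea it names: a trie-guided distribution sort. It is not the paper's DistStrSort. The function names (`bucket_key`, `build_synopsis`, `distribution_sort`), the `capacity` threshold, and the splitting rule are all assumptions made for illustration. A first pass grows a small prefix trie, splitting a prefix once it has received more than `capacity` strings. A second pass distributes the strings into buckets keyed by prefix, and each bucket is sorted on its own. In an external-memory setting each bucket would presumably be a file small enough to sort in RAM; here the buckets are plain lists.

```python
from collections import defaultdict


def bucket_key(s, expanded):
    """Return the shortest prefix of s that is not an expanded trie node,
    or s itself if every proper prefix of s has been expanded."""
    d = 0
    while d < len(s) and s[:d] in expanded:
        d += 1
    return s[:d]


def build_synopsis(strings, capacity):
    """Pass 1: grow the set of expanded prefixes (a hypothetical stand-in
    for a synopsis trie). Counts are approximate, because strings seen
    before a split are not re-counted under the new, longer prefixes."""
    expanded = set()
    counts = defaultdict(int)
    for s in strings:
        key = bucket_key(s, expanded)
        counts[key] += 1
        # Split a prefix once it has attracted more than `capacity` strings.
        if counts[key] > capacity and key not in expanded:
            expanded.add(key)
    return expanded


def distribution_sort(strings, capacity=4):
    """Pass 2: distribute strings into prefix buckets, sort each bucket,
    and concatenate the buckets in lexicographic order of their keys.
    `strings` must be re-iterable (e.g. a list), since it is read twice."""
    expanded = build_synopsis(strings, capacity)
    buckets = defaultdict(list)
    for s in strings:
        buckets[bucket_key(s, expanded)].append(s)
    result = []
    for key in sorted(buckets):
        # An expanded key only holds strings equal to the key itself, so
        # emitting buckets in key order preserves the global order.
        result.extend(sorted(buckets[key]))
    return result
```

Under these assumptions, `distribution_sort(words, capacity=2)` returns the same list as Python's built-in `sorted(words)`. The per-prefix counts collected in the first pass also hint at how such a synopsis could serve the counting and aggregation tasks the abstract mentions, although this sketch does not expose them.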