This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text $T$ of length $n$ and a positive integer $k$, we want to construct an index of $T$ such that, for any input pattern $P$, we can find all of its $k$-error matches in $T$ efficiently. This problem is well studied in the internal-memory setting. Here, we extend some of the recent internal-memory results to external-memory solutions that are also cache-oblivious. Our first index occupies $O((n \log^k n)/B)$ disk pages and finds all $k$-error matches with $O((|P| + occ)/B + \log^k n \log\log_B n)$ I/Os, where $B$ denotes the number of words in a disk page and $occ$ is the number of matches reported. To the best of our knowledge, this index is the first external-memory data structure that does not require $\Omega(|P| + occ + \mathrm{poly}(\log n))$ I/Os. The second index reduces the space to $O((n \log n)/B)$ disk pages, with an I/O complexity of $O((|P| + occ)/B + \log^{k(k+1)} n \log\log n)$.
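To make the notion of a $k$-error match concrete, here is a minimal sketch of the problem, not of the indexes described above. It assumes that "errors" means edit operations (insertions, deletions, and substitutions), which the abstract does not state explicitly. It scans the whole text with the classical column-by-column dynamic program, so it takes $O(n|P|)$ time per query and builds no index. The function name `k_error_match_ends` is hypothetical.

```python
def k_error_match_ends(text, pattern, k):
    """Return every 1-based position j such that some substring of
    `text` ending at j is within edit distance k of `pattern`.

    Assumption: errors are insertions, deletions, and substitutions.
    This is a plain scan of the text, not an index.
    """
    m = len(pattern)
    # col[i] = smallest edit distance between pattern[:i] and any
    # substring of the text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]   # col[i-1] from the previous column
        col[0] = 0           # a match may start anywhere in the text
        for i in range(1, m + 1):
            cur = col[i]     # col[i] from the previous column
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         cur + 1,           # extra character in the text
                         col[i - 1] + 1)    # extra character in the pattern
            prev_diag = cur
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    # "bcd" matches "bxd" with one substitution and ends at position 4.
    print(k_error_match_ends("abcdef", "bxd", 1))
```

An index for this problem aims to avoid the full scan: the bounds in the abstract depend on $|P|$, $occ$, and polylogarithmic terms in $n$ rather than on $n$ itself.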