We introduce a new text-indexing data structure, the String B-tree, which links traditional external-memory data structures with string-matching data structures. In short, it combines B-trees with Patricia tries that serve as internal-node indices, and it adds extra pointers to speed up search and update operations. As a result, the String B-tree overcomes the theoretical limitations of inverted files, B-trees, prefix B-trees, suffix arrays, compacted tries, and suffix trees. String B-trees have the same worst-case performance as B-trees, yet they handle strings of unbounded length and support much more powerful search operations, such as those supported by suffix trees. String B-trees are also effective in main memory (RAM model), where they improve on online suffix tree search over a dynamic set of strings. They can also be applied to database indexing and software duplication.
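To give a feel for the suffix-tree-style search mentioned above, here is a minimal in-memory sketch in Python. It is not the String B-tree itself: it keeps a single sorted array of pointers to suffixes and finds every occurrence of a pattern by binary search, with no B-tree nodes, Patricia tries, or disk blocks. The function names `build_suffix_pointers` and `find_occurrences` are hypothetical, chosen for this illustration only.

```python
def build_suffix_pointers(strings):
    """Return (string_id, offset) pairs sorted by the suffix each one points to.

    Assumption for illustration: everything fits in memory and we sort naively.
    """
    pointers = [(i, j) for i, s in enumerate(strings) for j in range(len(s))]
    pointers.sort(key=lambda p: strings[p[0]][p[1]:])
    return pointers


def find_occurrences(strings, pointers, pattern):
    """Return all (string_id, offset) positions where pattern occurs."""

    def prefix(p):
        # First len(pattern) characters of the suffix that p points to.
        i, j = p
        return strings[i][j:j + len(pattern)]

    # Truncating sorted suffixes keeps them in non-decreasing order,
    # so both boundaries can be found by binary search.
    lo, hi = 0, len(pointers)
    while lo < hi:  # leftmost pointer whose truncated suffix is >= pattern
        mid = (lo + hi) // 2
        if prefix(pointers[mid]) < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    hi = len(pointers)
    while lo < hi:  # leftmost pointer whose truncated suffix is > pattern
        mid = (lo + hi) // 2
        if prefix(pointers[mid]) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(pointers[start:lo])


strings = ["banana", "bandana"]
pointers = build_suffix_pointers(strings)
print(find_occurrences(strings, pointers, "ana"))
```

Each comparison in this sketch reads up to `len(pattern)` characters of a suffix. The abstract's point is that the String B-tree organizes such pointers in B-tree nodes indexed by Patricia tries, so that searches and updates keep B-tree-like worst-case performance even for unbounded-length strings.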