Building a complete inverted file for a set of text files in linear time
STOC '84 Proceedings of the sixteenth annual ACM symposium on Theory of computing
Given a finite set of texts S = {w1, …, wk} over some fixed finite alphabet Σ, a complete inverted file for S is an abstract data type that provides the functions find(w), which returns the longest prefix of w that occurs (as a subword of a word) in S; freq(w), which returns the number of times w occurs in S; and locations(w), which returns the set of positions where w occurs in S. A data structure that implements a complete inverted file for S that occupies linear space and can be built in linear time, using the uniform-cost RAM model, is given. Using this data structure, the time for each of the above query functions is optimal. To accomplish this, techniques from the theory of finite automata and the work on suffix trees are used to build a deterministic finite automaton that recognizes the set of all subwords of the set S. This automaton is then annotated with additional information and compacted to facilitate the desired query functions. The result is a data structure that is smaller and more flexible than the suffix tree.
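To make the three query functions concrete, here is a minimal Python sketch of the interface built on a plain suffix automaton. It is not the annotated and compacted structure the abstract describes, and it does not reproduce the stated bounds (it uses hash maps and a sort). The class name SubwordIndex, the end-of-text sentinels used to join the texts, and the (text index, offset) form of a position are all assumptions made for illustration.

```python
class SubwordIndex:
    """Illustrative suffix automaton over a set of texts (hypothetical name)."""

    def __init__(self, texts):
        self.next = [{}]     # outgoing transitions of each state
        self.link = [-1]     # suffix link of each state
        self.length = [0]    # length of the longest subword reaching each state
        self.count = [0]     # number of occurrences, filled in below
        self.end = [-1]      # global end position (-1 for root and clones)
        self.origin = []     # global position -> (text index, offset)
        last = 0
        for t, text in enumerate(texts):
            # Assumption: a unique tuple sentinel closes each text, so no
            # query made of characters can match across a text boundary.
            for off, sym in enumerate(list(text) + [("end-of-text", t)]):
                self.origin.append((t, off))
                last = self._extend(last, sym, len(self.origin) - 1)
        # Push occurrence counts up the suffix links, longest states first.
        order = sorted(range(1, len(self.next)), key=lambda s: -self.length[s])
        for v in order:
            self.count[self.link[v]] += self.count[v]
        # Inverted suffix links, used to enumerate end positions.
        self.children = [[] for _ in self.next]
        for v in range(1, len(self.next)):
            self.children[self.link[v]].append(v)

    def _new_state(self, length, trans, link, count, end):
        self.next.append(trans)
        self.link.append(link)
        self.length.append(length)
        self.count.append(count)
        self.end.append(end)
        return len(self.next) - 1

    def _extend(self, last, sym, pos):
        cur = self._new_state(self.length[last] + 1, {}, -1, 1, pos)
        p = last
        while p != -1 and sym not in self.next[p]:
            self.next[p][sym] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
            return cur
        q = self.next[p][sym]
        if self.length[p] + 1 == self.length[q]:
            self.link[cur] = q
            return cur
        # Split q: the clone takes over the shorter subwords.
        clone = self._new_state(self.length[p] + 1, dict(self.next[q]),
                                self.link[q], 0, -1)
        while p != -1 and self.next[p].get(sym) == q:
            self.next[p][sym] = clone
            p = self.link[p]
        self.link[q] = clone
        self.link[cur] = clone
        return cur

    def _walk(self, w):
        state, matched = 0, 0
        for ch in w:
            nxt = self.next[state].get(ch)
            if nxt is None:
                break
            state, matched = nxt, matched + 1
        return state, matched

    def find(self, w):
        """Longest prefix of w that occurs as a subword of some text."""
        return w[:self._walk(w)[1]]

    def freq(self, w):
        """Number of occurrences of a non-empty w across all texts."""
        state, matched = self._walk(w)
        return self.count[state] if w and matched == len(w) else 0

    def locations(self, w):
        """Set of (text index, start offset) pairs where w occurs."""
        state, matched = self._walk(w)
        if not w or matched != len(w):
            return set()
        found, stack = set(), [state]
        while stack:
            v = stack.pop()
            if self.end[v] >= 0:
                t, off = self.origin[self.end[v]]
                found.add((t, off - len(w) + 1))
            stack.extend(self.children[v])
        return found
```

With index = SubwordIndex(["abab", "ba"]), the call index.find("abc") walks the automaton as far as it can and returns "ab", index.freq("ba") counts one occurrence in each text, and index.locations("ba") reports where they start. In this sketch find and freq take time proportional to the length of w, and locations additionally visits the states below the one reached.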