An efficient representation for sparse sets
ACM Letters on Programming Languages and Systems (LOPLAS)
Algorithms on strings, trees, and sequences: computer science and computational biology
A Space-Economical Suffix Tree Construction Algorithm
Journal of the ACM (JACM)
Dictionary matching and indexing with errors and don't cares
STOC '04 Proceedings of the thirty-sixth annual ACM symposium on Theory of computing
Linear pattern matching algorithms
SWAT '73 Proceedings of the 14th Annual Symposium on Switching and Automata Theory (swat 1973)
Finding patterns with variable length gaps or don’t cares
COCOON'06 Proceedings of the 12th annual international conference on Computing and Combinatorics
Succinct Text Indexing with Wildcards
SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
Succincter text indexing with wildcards
CPM'11 Proceedings of the 22nd annual conference on Combinatorial pattern matching
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
A succinct index for hypertext
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Compressed text indexing with wildcards
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
An index structure for spaced seed search
ISAAC'11 Proceedings of the 22nd international conference on Algorithms and Computation
SecTTS: A secure track & trace system for RFID-enabled supply chains
Computers in Industry
String indexing for patterns with wildcards
SWAT'12 Proceedings of the 13th Scandinavian conference on Algorithm Theory
Journal of Discrete Algorithms
Compressed text indexing with wildcards
Journal of Discrete Algorithms
Compressed indexes for text with wildcards
Theoretical Computer Science
Given a text T of length n, the classical indexing problem for pattern matching is to build an index for T so that, for any query pattern P, all occurrences of P in T can be reported efficiently. Cole et al. (2004) extended this problem to allow don't care characters (wildcards) in the text and the pattern, and gave the first index that supports efficient pattern matching in this setting. The space complexity of their index is linear in the text length n but exponential in the number of wildcards. Motivated by bioinformatics applications, we investigate indexes whose size depends on n only. In the literature, space-efficient indexes for wildcard matching are known only for the special case in which wildcards appear in the pattern alone (Iliopoulos and Rahman, 2007); the space required is O(n). Little is known about the case in which wildcards appear in the text only, or in both the text and the pattern. In this paper, we give an O(n)-space index that supports efficient wildcard matching in all three cases. In practice, our solution to the pattern-only case substantially improves on the matching time of the previous work. In addition, our solution can be extended to handle optional wildcards, each of which can match zero or one character.
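To make the matching semantics concrete, the following is a minimal sketch of a naive, index-free baseline, not the O(n)-space index described above. It assumes a wildcard is written as `*` and matches exactly one arbitrary character, whether it occurs in the text, the pattern, or both; the names `WILDCARD`, `chars_match`, and `occurrences` are hypothetical and chosen only for illustration. Optional wildcards are not modelled.

```python
WILDCARD = "*"  # assumed symbol for a don't-care character (not from the paper)


def chars_match(a: str, b: str) -> bool:
    """Two characters match if they are equal or either one is a wildcard."""
    return a == WILDCARD or b == WILDCARD or a == b


def occurrences(text: str, pattern: str) -> list[int]:
    """Return every start position where pattern matches text.

    Naive scan in O(len(text) * len(pattern)) time. It only defines what
    an index for wildcard matching has to report; it is not an index.
    """
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(chars_match(text[i + j], pattern[j]) for j in range(m))
    ]


if __name__ == "__main__":
    print(occurrences("acgtacgt", "a*g"))  # wildcard in the pattern only
    print(occurrences("ac*t", "cg"))       # wildcard in the text only
    print(occurrences("ac*tacgt", "a*g"))  # wildcards in both
```

The three calls correspond to the three cases distinguished above: wildcards in the pattern only, in the text only, and in both.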