Indexing a dictionary for subset matching queries
SPIRE'07 Proceedings of the 14th international conference on String processing and information retrieval
We address the problem of building an index for a set D of n strings, where each string location is a subset of some finite integer alphabet of size σ, so that we can efficiently determine whether a given simple query string p (one in which each location is a single symbol) occurs in the set. That is, we need to efficiently find a string d ∈ D such that p[i] ∈ d[i] for every i. We show how to build such an index in O(n^{log_{σ/Δ}(σ)} log(n)) average time, where Δ is the average size of the subsets. Our methods have applications in, for example, computational biology (haplotype inference) and music information retrieval.
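To make the query condition concrete, the following is a minimal sketch of subset matching, not the index construction described in the abstract. It assumes that p and d have equal length, and the names `matches` and `SubsetTrie` are hypothetical, introduced only for illustration. The trie simply shares common prefixes of subset sequences and follows every child whose subset contains the current query symbol; no claim is made that it achieves the stated time bound.

```python
from typing import Dict, FrozenSet, Optional, Sequence, Tuple

Entry = Tuple[FrozenSet[int], ...]  # one dictionary string: a subset per location


def matches(p: Sequence[int], d: Sequence[FrozenSet[int]]) -> bool:
    """Return True if p[i] is in d[i] for every i (equal lengths assumed)."""
    return len(p) == len(d) and all(sym in pos for sym, pos in zip(p, d))


class SubsetTrie:
    """Illustrative trie whose edges are labeled by subsets of the alphabet."""

    def __init__(self) -> None:
        self.children: Dict[FrozenSet[int], "SubsetTrie"] = {}
        self.entry: Optional[Entry] = None  # set on nodes that end a string

    def insert(self, d: Sequence[Sequence[int]]) -> None:
        """Add one dictionary string, given as a sequence of symbol collections."""
        entry: Entry = tuple(frozenset(pos) for pos in d)
        node = self
        for subset in entry:
            node = node.children.setdefault(subset, SubsetTrie())
        node.entry = entry

    def query(self, p: Sequence[int], i: int = 0) -> Optional[Entry]:
        """Return some stored d with p[j] in d[j] for all j, or None."""
        if i == len(p):
            return self.entry
        for subset, child in self.children.items():
            if p[i] in subset:  # this edge is compatible with symbol p[i]
                found = child.query(p, i + 1)
                if found is not None:
                    return found
        return None


# Usage: a two-string dictionary over the integer alphabet {1, 2, 3}.
index = SubsetTrie()
index.insert([{1, 2}, {3}])
index.insert([{2}, {1, 3}])
hit = index.query([1, 3])
miss = index.query([3, 3])
```

In the worst case the query may explore many branches, which is why a dedicated index of the kind the abstract describes is of interest.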