This paper deals with finite-state automata used in natural language processing to represent very large dictionaries. We present a method for an important operation on these automata: compression that preserves quick access. Our proposal is to factorize subautomata other than those representing common prefixes or suffixes. The algorithm uses a DAWG (directed acyclic word graph) of subautomata to iteratively choose the best substructure to factorize. The resulting compact automaton retains linear-time acceptance. Experiments performed on ten automata are reported.
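For orientation, the baseline that the abstract contrasts itself with, sharing of common prefixes and suffixes, can be sketched as follows. This is only an illustration of that baseline and of linear-time acceptance, not the factorization method of the paper; the function names (build_trie, minimize, accepts) and the word list are hypothetical and chosen for this example.

```python
def build_trie(words):
    """Build a trie as nested dicts; the key "" marks a final state.
    Common prefixes are shared by construction."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = True
    return root


def count_trie_states(node):
    """Count the states of the trie (one per nested dict)."""
    return 1 + sum(count_trie_states(child)
                   for ch, child in node.items() if ch != "")


def minimize(node, registry, states):
    """Merge equivalent subtrees bottom-up so common suffixes are shared.
    Returns the id of the state that represents `node`."""
    children = tuple(sorted(
        (ch, minimize(child, registry, states))
        for ch, child in node.items() if ch != ""
    ))
    signature = ("" in node, children)
    if signature not in registry:
        # First time this (finality, transitions) pair is seen: new state.
        registry[signature] = len(states)
        states.append(("" in node, dict(children)))
    return registry[signature]


def accepts(states, start, word):
    """Follow one transition per character: time linear in len(word)."""
    state = start
    for ch in word:
        is_final, transitions = states[state]
        if ch not in transitions:
            return False
        state = transitions[ch]
    return states[state][0]


# Hypothetical toy dictionary, used only for this illustration.
words = ["tap", "taps", "top", "tops"]
trie = build_trie(words)
states = []
start = minimize(trie, {}, states)
print(count_trie_states(trie), "trie states ->", len(states), "merged states")
print(accepts(states, start, "tops"), accepts(states, start, "to"))
```

For these four words the trie has 8 states and the merged automaton has 5, because the branches after "ta" and "to" are identical and collapse into one. The method described in the abstract goes further by factorizing repeated subautomata that are neither common prefixes nor common suffixes, which this sketch does not attempt.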