The suffix tree, the suffix array, and the directed acyclic word graph (DAWG) are data structures for indexing a text. Although they enable efficient pattern matching, they require O(n log n) bits, which makes them impractical for indexing long texts such as the human genome. Recently, the development of compressed data structures has made it possible to simulate the suffix tree and the suffix array using O(n) bits. However, there is still no O(n)-bit data structure for the DAWG with full functionality. This work introduces an O(n)-bit data structure for simulating the DAWG. In addition, we propose an application of the DAWG that improves the time complexity of the local alignment problem. For this problem, the previously proposed solutions using BWT (a version of compressed suffix tree) run in O(n^2 m) worst-case time and O(n m^0.628) average-case time, where n and m are the lengths of the database and the query, respectively. Using the compressed DAWG proposed in this paper, the problem can be solved in O(nm) worst-case time with the same average-case time.
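For readers unfamiliar with the DAWG, the following minimal sketch may help. It builds a plain, pointer-based DAWG (also known as a suffix automaton) with the classical online construction and uses it for exact pattern matching. It is not the O(n)-bit compressed structure introduced in this work: it uses O(n log n) bits, which is exactly the cost the abstract describes as impractical for genome-scale text. The names State, build_dawg, and contains are illustrative assumptions, not identifiers from the paper.

```python
class State:
    """One DAWG node: outgoing edges, suffix link, longest string length."""
    __slots__ = ("next", "link", "length")

    def __init__(self, length=0, link=-1):
        self.next = {}        # character -> index of target state
        self.link = link      # suffix link (-1 for the initial state)
        self.length = length  # length of the longest string reaching this state


def build_dawg(text):
    """Classical online construction of an uncompressed DAWG (hypothetical helper)."""
    states = [State()]  # state 0 is the initial state (empty string)
    last = 0            # state representing the whole text read so far
    for ch in text:
        cur = len(states)
        states.append(State(length=states[last].length + 1))
        p = last
        # Add the new edge along the suffix-link chain until one already exists.
        while p != -1 and ch not in states[p].next:
            states[p].next[ch] = cur
            p = states[p].link
        if p == -1:
            states[cur].link = 0
        else:
            q = states[p].next[ch]
            if states[p].length + 1 == states[q].length:
                states[cur].link = q
            else:
                # Split q: the clone keeps q's edges but a shorter length.
                clone = len(states)
                copy = State(length=states[p].length + 1, link=states[q].link)
                copy.next = dict(states[q].next)
                states.append(copy)
                while p != -1 and states[p].next.get(ch) == q:
                    states[p].next[ch] = clone
                    p = states[p].link
                states[q].link = clone
                states[cur].link = clone
        last = cur
    return states


def contains(states, pattern):
    """Return True if pattern occurs in the indexed text; O(len(pattern)) steps."""
    s = 0
    for ch in pattern:
        s = states[s].next.get(ch)
        if s is None:
            return False
    return True


dawg = build_dawg("banana")
print(contains(dawg, "nan"), contains(dawg, "nab"))
```

Each state stores a dictionary of edges and an integer suffix link, so the memory footprint grows with the machine word size. That overhead is what a compressed representation such as the one proposed here aims to avoid.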