Faster and Space-Optimal Edit Distance "1" Dictionary

Authors:
Djamal Belazzougui
Affiliations:
Ineodev Company, Ecole Nationale Supérieure d'Informatique, Algiers, Algeria
Venue:
CPM '09 Proceedings of the 20th Annual Symposium on Combinatorial Pattern Matching
Year:
2009

Abstract

In the approximate dictionary search problem, we must construct a data structure on a set of strings so that we can answer queries of the following kind: find all strings in the set that are similar (according to some string distance) to a given string. In this paper we propose the first data structure for approximate dictionary search that occupies optimal space (up to a constant factor) and answers an approximate query for edit distance "1" (report all strings of the dictionary that are at edit distance at most "1" from the query string) in time linear in the length of the query string. Based on our new dictionary, we propose a full-text index for approximate queries with edit distance "1" (report all positions of all substrings of the text that are at edit distance at most "1" from the query string) that answers a query in time linear in the length of the query string, using space $O(n(\lg(n)\lg\lg(n))^2)$ in the worst case on a text of length $n$. Our index is the first that answers queries in time linear in the length of the query string while using space $O(n \cdot \mathrm{poly}(\log(n)))$ in the worst case and for any alphabet size.
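
To make the query semantics concrete, the following is a minimal Python sketch of an edit distance "1" dictionary query. It is not the data structure proposed in the paper: it is a naive deletion-neighborhood baseline, assumed here purely for illustration, that is neither space-optimal nor linear-time. The names `within_one_edit`, `deletion_variants`, and `NaiveEditOneDictionary` are hypothetical.

```python
from collections import defaultdict


def within_one_edit(a: str, b: str) -> bool:
    """Return True if a and b are at edit distance at most 1."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # ensure len(a) <= len(b)
    # Skip the longest common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Either the strings are equal or they differ by one substitution at i.
        return a[i + 1:] == b[i + 1:]
    # b is one character longer: it must have one extra character at i.
    return a[i:] == b[i + 1:]


def deletion_variants(s: str):
    """Yield s itself and every string obtained by deleting one character."""
    yield s
    for i in range(len(s)):
        yield s[:i] + s[i + 1:]


class NaiveEditOneDictionary:
    """Illustrative baseline only (an assumption, not the paper's structure)."""

    def __init__(self, words):
        # Map each deletion variant to the dictionary words that produce it.
        self._index = defaultdict(set)
        for w in set(words):
            for v in deletion_variants(w):
                self._index[v].add(w)

    def query(self, q: str):
        """Report all dictionary strings at edit distance at most 1 from q."""
        candidates = set()
        for v in deletion_variants(q):
            candidates |= self._index.get(v, set())
        # Shared deletion variants can also arise at distance 2
        # (e.g. "cat" and "act"), so every candidate is verified.
        return sorted(w for w in candidates if within_one_edit(w, q))


d = NaiveEditOneDictionary(["cat", "cot", "coat", "dog", "act"])
print(d.query("cat"))  # ['cat', 'coat', 'cot']
```

In this sketch the index stores one entry per character of every dictionary string, and a query hashes one variant per character of the query string, so both space and query time exceed the bounds stated in the abstract. It only pins down which strings a correct answer must contain.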