Theoretical Computer Science
The R*-tree: an efficient and robust access method for points and rectangles
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Towards an analysis of range query performance in spatial data structures
PODS '93 Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Suffix arrays: a new method for on-line string searches
SIAM Journal on Computing
CIKM '93 Proceedings of the second international conference on Information and knowledge management
Beyond uniformity and independence: analysis of R-trees using the concept of fractal dimension
PODS '94 Proceedings of the thirteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
An introduction to the analysis of algorithms
An introduction to the analysis of algorithms
A model for the prediction of R-tree performance
PODS '96 Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Block addressing indices for approximate text retrieval
CIKM '97 Proceedings of the sixth international conference on Information and knowledge management
Multidimensional access methods
ACM Computing Surveys (CSUR)
PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric
Journal of the ACM (JACM)
A Space-Economical Suffix Tree Construction Algorithm
Journal of the ACM (JACM)
A guided tour to approximate string matching
ACM Computing Surveys (CSUR)
ACM Computing Surveys (CSUR)
Combinatorial Algorithms on Words
Combinatorial Algorithms on Words
The K-D-B-tree: a search structure for large multidimensional dynamic indexes
SIGMOD '81 Proceedings of the 1981 ACM SIGMOD international conference on Management of data
R-trees: a dynamic index structure for spatial searching
SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Probabilistic Analysis of Generalized Suffix Trees (Extended Abstract)
CPM '92 Proceedings of the Third Annual Symposium on Combinatorial Pattern Matching
Approximate String-Matching over Suffix Trees
CPM '93 Proceedings of the 4th Annual Symposium on Combinatorial Pattern Matching
Filtration with q-Samples in Approximate String Matching
CPM '96 Proceedings of the 7th Annual Symposium on Combinatorial Pattern Matching
An Effective Clustering Algorithm to Index High Dimensional Metric Spaces
SPIRE '00 Proceedings of the Seventh International Symposium on String Processing Information Retrieval (SPIRE'00)
GLIMPSE: a tool to search through entire file systems
WTEC'94 Proceedings of the USENIX Winter 1994 Technical Conference on USENIX Winter 1994 Technical Conference
ACM Computing Surveys (CSUR)
A seriate coverage filtration approach for homology search
Proceedings of the 2004 ACM symposium on Applied computing
A new method for approximate indexing and dictionarylookup with one error
Information Processing Letters
A metric index for approximate string matching
Theoretical Computer Science
An approximate multi-word matching algorithm for robust document retrieval
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Compressed indexes for approximate string matching
ESA'06 Proceedings of the 14th conference on Annual European Symposium - Volume 14
Theoretical Computer Science
EXTRA: a system for example-based translation assistance
Machine Translation
Journal of Discrete Algorithms
OASIS: an online and accurate technique for local-alignment searches on biological sequences
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
From Nerode's congruence to suffix automata with mismatches
Theoretical Computer Science
A comprehensive trainable error model for sung music queries
Journal of Artificial Intelligence Research
A new method for approximate indexing and dictionary lookup with one error
Information Processing Letters
Algorithms for memory hierarchies: advanced lectures
Algorithms for memory hierarchies: advanced lectures
On the suffix automaton with mismatches
CIAA'07 Proceedings of the 12th international conference on Implementation and application of automata
A linear size index for approximate pattern matching
Journal of Discrete Algorithms
A linear size index for approximate pattern matching
CPM'06 Proceedings of the 17th Annual conference on Combinatorial Pattern Matching
CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching
Trying to outperform a well-known index with a sequential scan
Proceedings of the Joint EDBT/ICDT 2013 Workshops
Hi-index | 0.00 |
We present a radically new indexing approach for approximate string matching. The scheme uses the metric properties of the edit distance and can be applied to any other metric between strings. We build a metric space where the sites are the nodes of the suffix tree of the text, and the approximate query is seen as a proximity query on that metric space. This permits us finding the R occurrences of a pattern of length m in a text of length n in average time O(mlog2 n+m2+R), using O(n log n) space and O(n log2 n) index construction time. This complexity improves by far over all other previous methods. We also show a simpler scheme needing O(n) space.