The edit distance between two strings S and R is defined to be the minimum number of character inserts, deletes and changes needed to convert R to S. Given a text string t of length n and a pattern string p of length m, informally, the string edit distance matching problem is to compute the smallest edit distance between p and substrings of t. A well-known dynamic programming algorithm takes time O(nm) to solve this problem, and it is an important open problem in Combinatorial Pattern Matching to significantly improve this bound.

We relax the problem so that (a) we allow an additional operation, namely, substring moves, and (b) we approximate the string edit distance up to a factor of O(log n log* n). Our result is a near-linear-time deterministic algorithm for this version of the problem. This is the first known significantly subquadratic algorithm for a string edit distance problem in which the distance involves nontrivial alignments. Our results are obtained by embedding strings into L1 vector space using a simplified parsing technique we call Edit Sensitive Parsing (ESP). This embedding is approximately distance preserving, and we show many applications of this embedding to string proximity problems including nearest neighbors, outliers, and streaming computations with strings.
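
For reference, the classical O(nm) dynamic program for the unrelaxed matching problem can be sketched as follows. This is a minimal Python illustration of the well-known baseline only, not the ESP-based algorithm, and it does not handle substring moves; the function name `min_substring_edit_distance` is an assumption made for this sketch.

```python
def min_substring_edit_distance(pattern: str, text: str) -> int:
    """Smallest edit distance between `pattern` and any substring of `text`.

    Baseline O(n*m) dynamic program (hypothetical helper for illustration).
    col[i] holds the minimum edit distance between pattern[:i] and some
    substring of the text that ends at the current text position.
    """
    m = len(pattern)
    # Before reading any text, matching pattern[:i] against the empty
    # substring costs i edits.
    col = list(range(m + 1))
    best = col[m]
    for ch in text:
        # Row 0 is always 0: a match may start at any text position for free.
        new = [0] * (m + 1)
        for i in range(1, m + 1):
            change = col[i - 1] + (0 if pattern[i - 1] == ch else 1)
            skip_text_char = col[i] + 1         # text character has no partner
            skip_pattern_char = new[i - 1] + 1  # pattern character has no partner
            new[i] = min(change, skip_text_char, skip_pattern_char)
        col = new
        best = min(best, col[m])
    return best
```

The only difference from the usual two-string edit distance table is the zero-cost top row, which lets an alignment begin anywhere in t, and the final minimum over all end positions. Keeping a single column at a time uses O(m) space.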