We show how to determine whether the edit distance between two given strings is small in sublinear time. Specifically, we present a test which, given two n-character strings A and B, runs in time o(n) and with high probability returns "CLOSE" if their edit distance is O(n^α), and "FAR" if their edit distance is Ω(n), where α is a fixed parameter less than 1. Our algorithm for testing the edit distance works by recursively subdividing the strings A and B into smaller substrings and looking for pairs of substrings in A, B with small edit distance. To do this, we query both strings at random places, using a sample-economizing technique that does not pick the samples independently and yields better query and overall complexity. As a result, our test runs in time Õ(n^max(α/2, 2α − 1)) for any fixed α < 1. We also show a lower bound of Ω(n^(α/2)) on the query complexity of every algorithm that distinguishes pairs of strings with edit distance at most n^α from those with edit distance at least n/6.
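To make the statement concrete, the following minimal sketch spells out the distinguishing problem from the last sentence and the running-time exponent. It is not the sublinear test itself: `edit_distance` is the standard quadratic-time dynamic program, used here only as a ground-truth reference, and the names `promise_label` and `runtime_exponent`, as well as the label "UNCONSTRAINED", are hypothetical choices made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Exact edit (Levenshtein) distance via the textbook O(len(a) * len(b)) DP.

    Reference only: the test described in the abstract avoids reading the
    whole input, which this function does not.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,                # delete ca
                cur[j - 1] + 1,             # insert cb
                prev[j - 1] + (ca != cb),   # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def promise_label(a: str, b: str, alpha: float) -> str:
    """Label a pair of equal-length strings according to the promise problem.

    "CLOSE" if the distance is at most n**alpha, "FAR" if it is at least n/6.
    In between, a tester may answer either way (hypothetical label
    "UNCONSTRAINED"). For small n the two thresholds can overlap; "CLOSE"
    is checked first in that case.
    """
    n = len(a)
    d = edit_distance(a, b)
    if d <= n ** alpha:
        return "CLOSE"
    if d >= n / 6:
        return "FAR"
    return "UNCONSTRAINED"


def runtime_exponent(alpha: float) -> float:
    """Exponent e in the stated running time of roughly n**e, up to log factors."""
    return max(alpha / 2, 2 * alpha - 1)
```

The two terms in the exponent cross where α/2 = 2α − 1, that is, at α = 2/3. Below that point the α/2 term dominates and the running-time exponent coincides with the exponent in the query lower bound; above it the 2α − 1 term takes over. For example, `runtime_exponent(0.5)` evaluates to 0.25.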