Quantifying the similarity between two trees is a problem of intrinsic importance in the study of algorithms and data structures, with applications in computational molecular biology, structural/syntactic pattern recognition, and data management. In this paper we define and formulate an abstract measure of comparison, $\Omega(T_1, T_2)$, between two trees $T_1$ and $T_2$, presented in terms of a set of elementary intersymbol measures $\omega(\cdot, \cdot)$ and two abstract operators $\oplus$ and $\otimes$. By appropriately choosing the concrete values for these two operators and for $\omega(\cdot, \cdot)$, this measure can be used to define various quantities, including 1) the edit distance between two trees, 2) the size of their largest common subtree, 3) $\mathrm{Prob}(T_2 \mid T_1)$, the probability of receiving $T_2$ given that $T_1$ was transmitted across a channel causing independent substitution and deletion errors, and 4) the a posteriori probability of $T_1$ being the transmitted tree given that $T_2$ is the received tree containing independent substitution, insertion, and deletion errors. The recursive properties of $\Omega(T_1, T_2)$ are derived, and a single generic iterative dynamic programming scheme that computes all of the above quantities is developed. The time and space complexities of the algorithm are analyzed, and the implications of our results in both theoretical and applied fields are discussed.
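To make the idea of one scheme parameterized by two operators and an intersymbol measure concrete, here is a minimal Python sketch. It is not the paper's iterative algorithm: it is a plain memoized recursion over ordered forests, and the names `make_measure`, `oplus`, `otimes`, `identity`, and `omega` are assumptions introduced only for this illustration. Choosing (min, +) with unit costs yields an edit distance between ordered labeled trees; choosing (max, +) with a 0/1 match score yields the number of equally labeled nodes that can be put in correspondence, which is one possible reading of "largest common substructure".

```python
from functools import lru_cache
import operator

# Assumed representation: a tree is (label, children) where children is a
# tuple of trees; a forest is a tuple of trees. Tuples keep everything hashable.

def make_measure(oplus, otimes, identity, omega):
    """Build a tree comparison function from two operators and an
    inter-symbol measure omega(a, b). A None argument to omega stands for
    'no symbol' (deletion when b is None, insertion when a is None)."""

    @lru_cache(maxsize=None)
    def forest(f1, f2):
        if not f1 and not f2:
            return identity
        options = []
        if f1:
            (a, kids1), rest1 = f1[-1], f1[:-1]
            # Delete the rightmost root of f1; its children join the forest.
            options.append(otimes(forest(rest1 + kids1, f2), omega(a, None)))
        if f2:
            (b, kids2), rest2 = f2[-1], f2[:-1]
            # Insert the rightmost root of f2.
            options.append(otimes(forest(f1, rest2 + kids2), omega(None, b)))
        if f1 and f2:
            # Map the two rightmost roots onto each other, then compare
            # their child forests and the remaining forests independently.
            both = otimes(forest(rest1, rest2), forest(kids1, kids2))
            options.append(otimes(both, omega(a, b)))
        result = options[0]
        for candidate in options[1:]:
            result = oplus(result, candidate)
        return result

    return lambda t1, t2: forest((t1,), (t2,))

def unit_cost(a, b):
    # 0 for identical labels, 1 for any substitution, insertion, or deletion.
    return 0 if a == b else 1

def match_score(a, b):
    # 1 only when two real, identical labels are mapped onto each other.
    return 1 if a is not None and a == b else 0

edit_distance = make_measure(min, operator.add, 0, unit_cost)
common_size = make_measure(max, operator.add, 0, match_score)

leaf = lambda label: (label, ())
t1 = ("a", (leaf("b"), leaf("c")))
t2 = ("a", (leaf("b"), leaf("d")))
t3 = ("a", (("x", (leaf("b"), leaf("c"))),))

print(edit_distance(t1, t2), common_size(t1, t2))
print(edit_distance(t3, t1), common_size(t3, t1))
```

The memoized recursion is chosen for brevity; it explores more subproblems than a carefully ordered iterative table would, so it is only suitable for small trees.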