Numerical Similarity and Dissimilarity Measures Between Two Trees

Authors:
B. J. Oommen;K. Zhang;W. Lee
Venue:
IEEE Transactions on Computers
Year:
1996

Abstract

Quantifying the similarity between two trees is a problem of intrinsic importance in the study of algorithms and data structures, with applications in computational molecular biology, structural/syntactic pattern recognition, and data management. In this paper we define and formulate an abstract measure of comparison, $\Omega(T_1, T_2)$, between two trees $T_1$ and $T_2$, presented in terms of a set of elementary inter-symbol measures $\omega(\cdot, \cdot)$ and two abstract operators $\oplus$ and $\otimes$. By appropriately choosing the concrete values for these two operators and for $\omega(\cdot, \cdot)$, this measure can be used to define various quantities including 1) the edit distance between two trees, 2) the size of their largest common subtree, 3) $\mathrm{Prob}(T_2 \mid T_1)$, the probability of receiving $T_2$ given that $T_1$ was transmitted across a channel causing independent substitution and deletion errors, and 4) the a posteriori probability of $T_1$ being the transmitted tree given that $T_2$ is the received tree containing independent substitution, insertion and deletion errors. The recursive properties of $\Omega(T_1, T_2)$ have been derived, and a single generic iterative dynamic programming scheme to compute all the above quantities has been developed. The time and space complexities of the algorithm have been analyzed, and the implications of our results in both theoretical and applied fields have been discussed.
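The idea of one recursion serving several measures can be illustrated with a minimal Python sketch. It is not the paper's iterative scheme: it assumes ordered, labeled trees and uses a plain memoized forest recursion on the rightmost roots. The names `generic_measure`, `omega`, `oplus`, `otimes`, `unit`, `unit_cost` and `match_score` are illustrative assumptions, as is the use of `None` for the null symbol. Only the first two quantities from the abstract are shown, and "largest common subtree" is taken here to mean the largest number of equal-label nodes matched by an ordered edit mapping.

```python
import operator
from functools import lru_cache

# Assumed representation: a tree is (label, children), where children is a
# tuple of trees; a forest is a tuple of trees. None stands for the null symbol.

def generic_measure(t1, t2, omega, oplus, otimes, unit):
    """Combine elementary measures omega(x, y) with oplus / otimes over the
    edit mappings between two ordered trees (memoized recursion, a sketch)."""

    @lru_cache(maxsize=None)
    def big_omega(f, g):
        if not f and not g:
            return unit                      # identity element of otimes
        if f:
            v, v_kids = f[-1]                # rightmost root of forest f
            f_minus_v = f[:-1] + v_kids      # delete v, promote its children
        if g:
            w, w_kids = g[-1]                # rightmost root of forest g
            g_minus_w = g[:-1] + w_kids
        if not g:                            # only deletions remain
            return otimes(big_omega(f_minus_v, g), omega(v, None))
        if not f:                            # only insertions remain
            return otimes(big_omega(f, g_minus_w), omega(None, w))
        delete = otimes(big_omega(f_minus_v, g), omega(v, None))
        insert = otimes(big_omega(f, g_minus_w), omega(None, w))
        # Map v to w: children are compared with children, and the remaining
        # forests to the left of the two rightmost trees with each other.
        match = otimes(
            otimes(big_omega(v_kids, w_kids), big_omega(f[:-1], g[:-1])),
            omega(v, w),
        )
        return oplus(oplus(delete, insert), match)

    return big_omega((t1,), (t2,))

def unit_cost(x, y):
    # Unit-cost edit model: 0 for equal labels, 1 for any edit operation.
    return 0 if x == y else 1

def match_score(x, y):
    # Score 1 only when two real (non-null) symbols carry the same label.
    return 1 if x is not None and x == y else 0

t1 = ("a", (("b", ()), ("c", ())))
t2 = ("a", (("b", ()), ("d", ())))

# oplus = min, otimes = +  gives the unit-cost edit distance.
print(generic_measure(t1, t2, unit_cost, min, operator.add, 0))    # 1
# oplus = max, otimes = +  gives the size of the largest common matching.
print(generic_measure(t1, t2, match_score, max, operator.add, 0))  # 2
```

Swapping `min` for `max` and changing `omega` is the only difference between the two calls, which is the point of the abstraction. A probabilistic instantiation would additionally need a careful treatment of how edit sequences are counted, so it is left out of this sketch.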