Nowadays, there is a growing interest in machine learning and pattern recognition for tree-structured data. Trees provide a suitable structural representation for complex tasks such as web information extraction, RNA secondary structure prediction, computer music, and the conversion of semi-structured data (e.g. XML documents). Many applications in these domains require computing similarities between pairs of trees. In this context, the tree edit distance (ED) has been the subject of investigation for many years, with the aim of improving its computational efficiency. However, in its classical form, the tree ED requires edit costs that are fixed a priori and often difficult to tune, which leaves little room for tackling complex problems. In this paper, to overcome this drawback, we focus on automatically learning a non-parametric stochastic tree ED. More precisely, we consider two kinds of probabilistic approaches. The first builds a generative model of the tree ED from a joint distribution over the edit operations, while the second works from a conditional distribution and thus provides a discriminative model. To tackle these tasks, we present an adaptation of the expectation-maximization algorithm for learning these distributions over the primitive edit costs. Two experiments are conducted. The first, on artificial data, confirms the benefit of learning a tree ED rather than imposing edit costs a priori; the second applies the approach to a pattern recognition task, the classification of handwritten digits.
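To illustrate why the choice of fixed edit costs matters, the following is a minimal sketch of the classical ordered tree ED, written as a memoized recursion over forests. It is not the learned stochastic model of this paper and not an efficient algorithm; the tree encoding `(label, children)`, the function name `tree_edit_distance`, and the two cost settings are assumptions chosen purely for illustration.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), children being a tuple of trees.
def tree_edit_distance(t1, t2, delete_cost, insert_cost, substitute_cost):
    """Classical ordered tree edit distance with user-supplied (fixed) costs."""

    @lru_cache(maxsize=None)
    def forest_dist(f, g):
        # f and g are forests, i.e. tuples of trees.
        if not f and not g:
            return 0.0
        if not f:
            # Only insertions remain: insert the root of the rightmost tree of g.
            label, kids = g[-1]
            return forest_dist(f, g[:-1] + kids) + insert_cost(label)
        if not g:
            # Only deletions remain: delete the root of the rightmost tree of f.
            label, kids = f[-1]
            return forest_dist(f[:-1] + kids, g) + delete_cost(label)
        (a, a_kids), (b, b_kids) = f[-1], g[-1]
        return min(
            # Delete root a; its children take its place in the forest.
            forest_dist(f[:-1] + a_kids, g) + delete_cost(a),
            # Insert root b; its children take its place in the forest.
            forest_dist(f, g[:-1] + b_kids) + insert_cost(b),
            # Map a onto b, then solve the children and the remaining forests.
            forest_dist(f[:-1], g[:-1])
            + forest_dist(a_kids, b_kids)
            + substitute_cost(a, b),
        )

    return forest_dist((t1,), (t2,))


# Two small trees that differ in a single leaf label (c versus d).
t1 = ("a", (("b", ()), ("c", ())))
t2 = ("a", (("b", ()), ("d", ())))

def unit(label):
    return 1.0

def unit_sub(x, y):
    return 0.0 if x == y else 1.0

def costly_sub(x, y):
    return 0.0 if x == y else 3.0

# Unit costs: one substitution is the cheapest edit script.
unit_distance = tree_edit_distance(t1, t2, unit, unit, unit_sub)
# Expensive substitution: deleting c and inserting d becomes cheaper.
costly_distance = tree_edit_distance(t1, t2, unit, unit, costly_sub)
```

With unit costs the two trees are at distance 1.0, while raising the substitution cost to 3.0 changes both the optimal edit script and the distance (2.0). This sensitivity to hand-set costs is the drawback that motivates learning the costs from data.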