In this paper, we present a Structured Information Retrieval (SIR) model based on graph matching. Our approach combines content propagation, which handles sibling relationships, with a document-query structure matching process. The latter is based on Tree-Edit Distance (TED), the minimum cost of a sequence of insert, delete, and replace operations that transforms one tree into another. To our knowledge, this algorithm has never been used in ad-hoc SIR. As the effectiveness of TED depends on both the input tree and the edit costs, we first present a focused subtree extraction technique that selects the elements of the document that are most representative with respect to the query. We then describe how we set the TED costs based on the Document Type Definition (DTD). Finally, we discuss our results according to the type of collection (data-oriented or text-oriented). Experiments are conducted on two INEX test sets: the 2010 Datacentric collection and the 2005 Ad-hoc one.
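To make the TED definition concrete, here is a minimal sketch of the textbook forest-based recursion for ordered labeled trees. It is not the algorithm or the cost setting used in the paper: the helper names `tree` and `tree_edit_distance` are hypothetical, and the uniform unit costs are an assumption standing in for the DTD-based costs mentioned above.

```python
from functools import lru_cache


def tree(label, *children):
    """Build an ordered labeled tree as a hashable (label, children) pair."""
    return (label, tuple(children))


def tree_edit_distance(t1, t2, ins=1, delete=1, rep=1):
    """Edit distance between two ordered trees.

    Assumption: uniform costs (ins, delete, rep); the paper derives its
    costs from the DTD instead. Plain memoized recursion over forests,
    suitable only for small trees.
    """

    @lru_cache(maxsize=None)
    def dist(f, g):
        # f and g are forests: tuples of trees, processed from the right.
        if not f and not g:
            return 0
        if not g:
            _, kids = f[-1]
            # Delete the rightmost root; its children take its place.
            return dist(f[:-1] + kids, g) + delete
        if not f:
            _, kids = g[-1]
            # Insert the rightmost root of g.
            return dist(f, g[:-1] + kids) + ins
        (lv, kv), (lw, kw) = f[-1], g[-1]
        return min(
            dist(f[:-1] + kv, g) + delete,          # delete rightmost root of f
            dist(f, g[:-1] + kw) + ins,             # insert rightmost root of g
            dist(kv, kw) + dist(f[:-1], g[:-1])     # match the two roots
            + (0 if lv == lw else rep),
        )

    return dist((t1,), (t2,))


# Hypothetical document subtree and query tree.
doc = tree("article", tree("title"), tree("sec", tree("p")))
query = tree("article", tree("sec", tree("p")))
print(tree_edit_distance(doc, query))  # one deletion ("title") suffices
```

The three cost parameters are exposed so that a different setting, for example one derived from a DTD, could be substituted for the unit costs without changing the recursion.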