Semi-structured Information Retrieval (SIR) allows users to narrow their search down to the element level. Since both queries and XML documents can be seen as hierarchically nested elements, we consider that their structural proximity can be evaluated through the similarity of their trees. Our approach combines a content score and a structure score, the latter being based on tree edit distance (the minimal cost of the operations needed to turn one tree into another). We use the tree structure to propagate and combine both measures. Moreover, to limit time and space complexity, we summarize the document tree structure. We experimented with various tree summary techniques, as well as with our original model, on the SSCAS task of the INEX 2005 campaign. Results showed that our approach outperforms state-of-the-art approaches.
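To make the notion of tree edit distance concrete, the following is a minimal sketch of the textbook recursive definition for ordered, labeled trees. It is not the algorithm used in this work: the unit cost of 1 per insertion, deletion, or relabeling, the tuple-based tree representation, and the names `tree`, `size`, `forest_distance`, and `tree_edit_distance` are all assumptions made for illustration.

```python
from functools import lru_cache


def tree(label, *children):
    """Build an ordered tree as (label, tuple_of_child_trees)."""
    return (label, tuple(children))


def size(forest):
    """Number of nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests, assuming unit costs."""
    if not f and not g:
        return 0
    if not f:
        return size(g)  # insert every node of g
    if not g:
        return size(f)  # delete every node of f

    v_label, v_children = f[-1]  # rightmost root of f
    w_label, w_children = g[-1]  # rightmost root of g

    # Delete v: its children take its place in the forest.
    delete = forest_distance(f[:-1] + v_children, g) + 1
    # Insert w: symmetric to deletion.
    insert = forest_distance(f, g[:-1] + w_children) + 1
    # Match v with w: compare their subtrees and the remaining forests.
    match = (
        forest_distance(v_children, w_children)
        + forest_distance(f[:-1], g[:-1])
        + (0 if v_label == w_label else 1)
    )
    return min(delete, insert, match)


def tree_edit_distance(t1, t2):
    """Edit distance between two single trees."""
    return forest_distance((t1,), (t2,))


# Hypothetical query and document structures.
query = tree("article", tree("sec", tree("title")))
document = tree("article", tree("body", tree("sec", tree("title"))))
distance = tree_edit_distance(query, document)
```

Here the query and the document differ only by the extra `body` element, so a single insertion separates them. The memoized recursion is simple but explores many sub-forests, which illustrates why the cost of the computation grows quickly with tree size and why summarizing the document tree is attractive.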