Systems that produce ranked lists of results are abundant; Web search engines, for instance, return ranked lists of Web pages. Prior work has defined distance measures for list permutations, such as Kendall tau and Spearman's footrule, and has extended them to top-k lists, which are more common in practice. Beyond ranking whole objects (e.g., Web pages), a growing number of systems provide keyword search over XML or other semistructured data and produce ranked lists of XML sub-trees. Unfortunately, previous distance measures are not suitable for ranked lists of sub-trees because they do not account for possible overlap between the returned sub-trees: two sub-trees that differ by a single node are treated as entirely separate objects. In this paper, we present the first distance measures for ranked lists of sub-trees and show under what conditions these measures are metrics. Furthermore, we present algorithms to compute these distance measures efficiently. Finally, we evaluate and compare the proposed measures on real data using three popular XML keyword proximity search systems.
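For readers unfamiliar with the two classical measures named in the abstract, the following is a minimal sketch of Spearman's footrule and the Kendall tau distance for full permutations of the same items. It illustrates only the prior, whole-object measures, not the sub-tree measures proposed in the paper; the function names and sample lists are assumptions made for illustration.

```python
from itertools import combinations


def spearman_footrule(a, b):
    """Sum of absolute position differences of each item between two rankings.

    Assumes a and b are permutations of the same set of distinct items.
    """
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))


def kendall_tau(a, b):
    """Number of item pairs that the two rankings place in opposite order.

    Assumes a and b are permutations of the same set of distinct items.
    """
    pos_b = {item: i for i, item in enumerate(b)}
    # combinations(a, 2) yields (x, y) with x ranked before y in a;
    # the pair is discordant when b ranks y before x.
    return sum(1 for x, y in combinations(a, 2) if pos_b[x] > pos_b[y])


# Hypothetical rankings of four whole objects (e.g., Web pages).
ranking_1 = ["p1", "p2", "p3", "p4"]
ranking_2 = ["p2", "p1", "p4", "p3"]

print(spearman_footrule(ranking_1, ranking_2))  # 4
print(kendall_tau(ranking_1, ranking_2))        # 2
```

Both functions compare items purely by identity, which is why they would treat two sub-trees that differ by a single node as unrelated objects, the limitation the abstract points out.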