Probabilistic XML is a hierarchical data model that captures uncertainty in both value and structure. Computing the similarity between an XML document and a probabilistic XML document is a building block of many applications, including querying, comparison, alignment and classification. The new challenge in computing this similarity efficiently is the multiplicity of the possible worlds that a probabilistic XML document represents. We devise and discuss an algorithm that efficiently computes the similarity between an XML document and a probabilistic XML document. We empirically evaluate the performance of the algorithm and its variants and compare them.
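
To make the possible-worlds challenge concrete, the following is a minimal sketch of the naive baseline that an efficient algorithm would need to avoid: it enumerates every possible world of a tiny probabilistic tree and averages a similarity score over them. Everything in it is an assumption made for illustration and is not taken from the paper: the tuple encoding of trees, the model in which each child subtree exists independently with a given probability, the function names, and the stand-in similarity measure (Jaccard overlap of root-to-node label paths).

```python
from itertools import product

# Hypothetical encodings, chosen only for this sketch:
#   deterministic tree:  (label, [child_tree, ...])
#   probabilistic tree:  (label, [(prob, child_ptree), ...])
# Assumption: each child subtree exists independently with probability prob.

def possible_worlds(ptree):
    """Yield (probability, deterministic_tree) for each possible world."""
    label, pchildren = ptree
    options = []
    for p, child in pchildren:
        # The child is either absent, or present in one of its own worlds.
        opts = [(1.0 - p, None)]
        opts += [(p * q, world) for q, world in possible_worlds(child)]
        options.append(opts)
    # Combine one option per child; a leaf yields exactly one world.
    for combo in product(*options):
        prob = 1.0
        kids = []
        for q, world in combo:
            prob *= q
            if world is not None:
                kids.append(world)
        if prob > 0.0:  # skip worlds that cannot occur
            yield prob, (label, kids)

def label_paths(tree, prefix=()):
    """Return the set of root-to-node label paths of a deterministic tree."""
    label, children = tree
    path = prefix + (label,)
    paths = {path}
    for child in children:
        paths |= label_paths(child, path)
    return paths

def jaccard(a, b):
    """Stand-in similarity: Jaccard overlap of label-path sets."""
    pa, pb = label_paths(a), label_paths(b)
    return len(pa & pb) / len(pa | pb)

def expected_similarity(doc, ptree):
    """Average the similarity to doc over all possible worlds of ptree."""
    return sum(p * jaccard(doc, w) for p, w in possible_worlds(ptree))

# Illustrative data: "name" is certain, "phone" exists with probability 0.5.
doc = ("person", [("name", []), ("phone", [])])
pdoc = ("person", [(1.0, ("name", [])), (0.5, ("phone", []))])
score = expected_similarity(doc, pdoc)
```

With n independent optional subtrees, this enumeration visits up to 2^n worlds, which is why direct enumeration does not scale and why the multiplicity of possible worlds is the central difficulty.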