XML documents from different sources may represent the same or similar information in both content and structure. Integrating similar XML documents is important for query systems and search engines. Because information changes over time, it is also important to detect the changes between versions of an XML document and to use those changes to discover semantic similarity among XML documents. In this paper, we introduce an approach that detects XML similarity by using a change detection mechanism to join XML document versions. Keys in subtrees play an important role in our approach: they let us avoid unnecessary comparisons of subtrees across different versions of the same document. We store the XML versions in a relational database and use SQL to detect similarities. We show that our approach is highly scalable and achieves better execution time while providing comparable result quality.
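The idea of storing XML versions in relational tables and joining subtrees on their keys can be illustrated with a minimal sketch. This is not the authors' implementation. The table layout (`leaf`), the choice of `isbn` as the subtree key, the sample documents, and the function names `shred` and `detect_changes` are all assumptions made for illustration, and the sketch assumes every subtree carries its key element. It uses only Python's standard `sqlite3` and `xml.etree.ElementTree` modules.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same document; subtree order differs.
OLD = """<library>
  <book><isbn>111</isbn><title>XML Basics</title><price>30</price></book>
  <book><isbn>222</isbn><title>SQL Primer</title><price>25</price></book>
</library>"""

NEW = """<library>
  <book><isbn>222</isbn><title>SQL Primer</title><price>27</price></book>
  <book><isbn>333</isbn><title>Tree Matching</title><price>40</price></book>
</library>"""


def shred(conn, version, xml_text, key_tag="isbn"):
    """Store each leaf as a row tagged with its version and subtree key."""
    root = ET.fromstring(xml_text)
    for subtree in root:
        key = subtree.findtext(key_tag)  # assumed present in every subtree
        for leaf in subtree:
            conn.execute(
                "INSERT INTO leaf VALUES (?, ?, ?, ?)",
                (version, key, f"{subtree.tag}/{leaf.tag}",
                 (leaf.text or "").strip()),
            )


def detect_changes(old_xml, new_xml):
    """Return (updated leaves, deleted subtree keys, inserted subtree keys)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE leaf (version TEXT, subtree_key TEXT, path TEXT, value TEXT)"
    )
    shred(conn, "old", old_xml)
    shred(conn, "new", new_xml)

    # Joining on the key compares only subtrees that describe the same entity,
    # so document order is irrelevant and unrelated subtrees are never paired.
    updated = conn.execute(
        """SELECT o.subtree_key, o.path, o.value, n.value
           FROM leaf o JOIN leaf n
             ON o.subtree_key = n.subtree_key AND o.path = n.path
           WHERE o.version = 'old' AND n.version = 'new'
             AND o.value <> n.value
           ORDER BY o.subtree_key, o.path"""
    ).fetchall()

    # Keys present in one version only mark deleted or inserted subtrees.
    only_in = """SELECT DISTINCT subtree_key FROM leaf
                 WHERE version = ? AND subtree_key NOT IN
                   (SELECT subtree_key FROM leaf WHERE version = ?)
                 ORDER BY subtree_key"""
    deleted = [row[0] for row in conn.execute(only_in, ("old", "new"))]
    inserted = [row[0] for row in conn.execute(only_in, ("new", "old"))]
    conn.close()
    return updated, deleted, inserted


if __name__ == "__main__":
    updated, deleted, inserted = detect_changes(OLD, NEW)
    print("updated:", updated)
    print("deleted:", deleted)
    print("inserted:", inserted)
```

In this sketch the key join does the pruning: a subtree in one version is compared only against the subtree with the same key in the other version, rather than against every candidate subtree. A real system would also need to handle nested subtrees, missing keys, and semantic matching of element names, none of which is attempted here.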