The rapid growth and continual evolution of XML data on the Internet call for a change detection system that tracks important changes over the lifetime of that data. In this paper, we introduce a novel approach for detecting changes between two versions of unordered XML data stored in a traditional relational database using mapping approaches such as XRel. Most existing work on XML change detection constructs Document Object Model (DOM) trees for the two versions and then compares the two tree structures using Longest Common Subsequence (LCS) and minimum edit distance computations. This basic tree comparison approach does not handle large XML files efficiently because (1) an equivalent XML DOM tree is twice as large as the original document and (2) the entire trees of both versions must be memory resident during the comparison. Both factors limit the approach to documents that fit in the available main memory. In addition, existing approaches fail to detect changes among versions of XML data stored in relational databases, because the reverse mapping from relations back to XML is not lossless. We propose an efficient algorithm (XRel_Change_SQL) that uses Structured Query Language (SQL) to detect unordered changes between two XML data files stored with XRel as the underlying relational data model. We compare the efficiency and quality of our change detection algorithm with existing XML change detection tools such as X-Diff, DeltaXML, and XANDY. An experimental evaluation on benchmark datasets and on synthetic datasets shows that our approach is highly scalable and achieves much better efficiency and delta quality than these approaches and tools.
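The general idea of detecting unordered changes with SQL over path-shredded XML can be illustrated with a minimal sketch. This is not the XRel_Change_SQL algorithm itself, which the abstract does not spell out. The single `text_node(doc_id, path, value)` table is a hypothetical simplification inspired by XRel's path-based storage, and the helper names `make_db`, `shred`, and `detect_changes` are assumptions made for illustration. The sketch parses the XML in memory only to load the demo data; the comparison itself runs entirely in the database.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_db():
    """Create an in-memory database with a simplified, XRel-inspired table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE text_node (doc_id INTEGER, path TEXT, value TEXT)")
    return conn


def shred(conn, doc_id, xml_text):
    """Store each text-bearing element as a (doc_id, root-to-node path, value) row."""
    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        text = (elem.text or "").strip()
        if text:
            conn.execute(
                "INSERT INTO text_node VALUES (?, ?, ?)", (doc_id, path, text)
            )
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")


# Rows present in the first version but absent from the second.
# EXCEPT is set-based, so sibling order is ignored (unordered comparison),
# but duplicate (path, value) pairs collapse: a limitation of this sketch.
DIFF_SQL = """
SELECT path, value FROM text_node WHERE doc_id = ?
EXCEPT
SELECT path, value FROM text_node WHERE doc_id = ?
ORDER BY 1, 2
"""


def detect_changes(conn, old_id, new_id):
    """Return deleted and inserted (path, value) pairs between two stored versions."""
    deleted = conn.execute(DIFF_SQL, (old_id, new_id)).fetchall()
    inserted = conn.execute(DIFF_SQL, (new_id, old_id)).fetchall()
    return {"deleted": deleted, "inserted": inserted}


if __name__ == "__main__":
    old = "<catalog><book><title>A</title><price>10</price></book></catalog>"
    new = "<catalog><book><price>12</price><title>A</title></book></catalog>"
    conn = make_db()
    shred(conn, 1, old)
    shred(conn, 2, new)
    print(detect_changes(conn, 1, 2))
```

Because the comparison is a set difference over rows, reordering sibling elements yields no reported change, which matches the unordered model. A value update appears as one deleted row and one inserted row on the same path; pairing those into a single update operation would need an additional matching step that this sketch omits.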