We present a diff algorithm for XML data. This work is motivated by the need for change control in the Xyleme project, which is investigating dynamic warehouses capable of storing massive volumes of XML data. Because of this context, our algorithm has to be very efficient in terms of speed and memory space, even at the cost of some loss of "quality". Besides insertions, deletions and updates (standard in diffs), it also considers a move operation on subtrees, which is essential in the context of XML. Intuitively, our diff algorithm uses signatures to match (large) subtrees that were left unchanged between the old and new versions. Such exact matchings are then possibly propagated to ancestors and descendants to obtain more matchings. The algorithm also uses XML-specific information such as ID attributes. We provide a performance analysis of the algorithm and show that it runs on average in linear time, versus quadratic time for previous algorithms. We present experiments on synthetic data that confirm the analysis. Since this problem is NP-hard, the linear time is obtained by trading some quality. We present experiments (again on synthetic data) that show that the output of our algorithm is reasonably close to the "optimal" in terms of quality. Finally, we present experiments on a small sample of XML pages found on the Web.
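The signature-matching idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `subtree_signatures` and `match_unchanged_subtrees`, the use of SHA-1 as the signature, and the largest-subtree-first order are assumptions made for illustration. The propagation of matchings to ancestors and descendants and the use of ID attributes are left out.

```python
import hashlib
import xml.etree.ElementTree as ET


def subtree_signatures(root):
    """Map every element under root to (signature, subtree size).

    Hypothetical helper: the signature hashes the tag, attributes, text
    and the child signatures in order. Tail text is ignored for brevity.
    """
    info = {}

    def visit(elem):
        h = hashlib.sha1()
        h.update(elem.tag.encode("utf-8"))
        for name, value in sorted(elem.attrib.items()):
            h.update(f"|{name}={value}".encode("utf-8"))
        h.update(b"|" + (elem.text or "").strip().encode("utf-8"))
        size = 1
        for child in elem:
            child_sig, child_size = visit(child)
            h.update(child_sig.encode("ascii"))
            size += child_size
        info[elem] = (h.hexdigest(), size)
        return info[elem]

    visit(root)
    return info


def match_unchanged_subtrees(old_root, new_root):
    """Pair identical subtrees of the two versions, largest first (assumed order)."""
    old_info = subtree_signatures(old_root)
    new_info = subtree_signatures(new_root)

    # Index the old version by signature for constant-time candidate lookup.
    candidates = {}
    for elem, (sig, _) in old_info.items():
        candidates.setdefault(sig, []).append(elem)

    used_old, used_new = set(), set()
    matches = []
    for new_elem in sorted(new_info, key=lambda e: new_info[e][1], reverse=True):
        if new_elem in used_new:
            continue  # already covered by a larger matched subtree
        sig = new_info[new_elem][0]
        old_elem = next(
            (c for c in candidates.get(sig, []) if c not in used_old), None
        )
        if old_elem is None:
            continue
        matches.append((old_elem, new_elem))
        # Mark both whole subtrees as consumed so nodes are matched once.
        used_old.update(old_elem.iter())
        used_new.update(new_elem.iter())
    return matches


if __name__ == "__main__":
    old = ET.fromstring("<doc><sec><p>a</p><p>b</p></sec><note>x</note></doc>")
    new = ET.fromstring(
        "<doc><note>x</note><sec><p>a</p><p>b</p></sec><p>c</p></doc>"
    )
    for old_elem, new_elem in match_unchanged_subtrees(old, new):
        print(old_elem.tag, "matches", new_elem.tag)
```

In this sketch a subtree that changes position but not content still receives the same signature in both versions, which is how a move can be recognised rather than reported as a deletion plus an insertion. Nodes left unmatched, such as the new `<p>c</p>` in the usage example, would be handled by later phases of a real diff.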