Improving XML instances comparison with preprocessing algorithms

Authors:
Rodrigo Gonçalves; Ronaldo dos Santos Mello
Affiliations:
Universidade Federal de Santa Catarina, Santa Catarina, Brazil; Universidade Federal de Santa Catarina, Santa Catarina, Brazil
Venue:
DEXA'07 Proceedings of the 18th international conference on Database and Expert Systems Applications
Year:
2007

Abstract

The integration of data instances, especially on the Web, involves analyzing and matching data from two or more sources, including XML sources. XML sources in particular introduce new challenges to the integration process because of their dynamic and irregular structure. In this context, one of the hardest steps is determining which XML instances are similar. This paper presents a group of algorithms that prepare XML instances for comparison. We analyze the benefit of these algorithms over existing XML comparison approaches.