Data stored in a data warehouse must be kept consistent and up to date with respect to the underlying information sources. If changes in these sources can be identified, categorized, and detected, only the modified data needs to be transferred and entered into the warehouse. The alternative, periodically reloading from scratch, is obviously inefficient. When the schema of an information source changes, all components that interact with, or make use of, data originating from that source must be updated to conform. The change detection problem is the problem of detecting data and schema changes by comparing two versions of the same semi-structured document. In this paper, we present an approach to detecting data and schema changes in scientific documents. Scientific data is of particular interest because it is normally stored as semi-structured documents and undergoes frequent schema updates. This paper demonstrates the use of graphs to represent scientific documents in particular, and semi-structured documents in general, as well as their schemas. It also demonstrates an approach to efficiently detecting data and schema changes by merging the detection with the parsing of the document.
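To make the distinction between data changes and schema changes concrete, the following is a minimal illustrative sketch and not the algorithm presented in the paper. It assumes the two document versions are small XML strings, treats the set of element paths as a stand-in for the schema, and assumes each path occurs at most once. The names `flatten` and `detect_changes` and the sample documents are hypothetical. Unlike the approach described above, it parses both versions fully before comparing them rather than merging detection with parsing.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Map each element path (e.g. 'experiment/temperature') to its text value.

    Simplifying assumption: every path occurs at most once in a document,
    so repeated sibling elements are not handled.
    """
    paths = {}

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        paths[path] = (node.text or "").strip()
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def detect_changes(old_xml, new_xml):
    """Compare two versions of a document and classify the differences."""
    old, new = flatten(old_xml), flatten(new_xml)
    return {
        # Paths present in only one version indicate a structural (schema) change.
        "schema_added": sorted(new.keys() - old.keys()),
        "schema_removed": sorted(old.keys() - new.keys()),
        # Paths present in both versions with different text indicate a data change.
        "data_changed": sorted(
            p for p in old.keys() & new.keys() if old[p] != new[p]
        ),
    }


# Hypothetical sample versions of one scientific document.
OLD_VERSION = (
    "<experiment><temperature>300</temperature>"
    "<pressure>1.0</pressure></experiment>"
)
NEW_VERSION = (
    "<experiment><temperature>310</temperature>"
    "<humidity>0.4</humidity></experiment>"
)

if __name__ == "__main__":
    print(detect_changes(OLD_VERSION, NEW_VERSION))
```

In this sketch, a warehouse loader would only need to re-transfer the values listed under `data_changed`, while any entry under `schema_added` or `schema_removed` would signal that components depending on the source need to be updated.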