Tracing the lineage of view data in a warehousing environment
ACM Transactions on Database Systems (TODS)
Tracing lineage beyond relational operators
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Provenance in Databases: Why, How, and Where
Foundations and Trends in Databases
TRAMP: understanding the behavior of schema mappings through provenance
Proceedings of the VLDB Endowment
The Foundations for Provenance on the Web
Foundations and Trends in Web Science
Tracing conceptual models' evolution in data warehouses by using the model driven architecture
Computer Standards & Interfaces
A data warehousing system collects data from multiple distributed sources and stores the integrated information as materialized views in a local data warehouse. Users then perform data analysis and mining on the warehouse views. In many cases, the warehouse view contents alone are not sufficient for in-depth analysis.

It is often useful to be able to "drill through" from interesting (or potentially erroneous) view data to the original source data that derived the view data. For a given view data item, identifying the exact set of base data items that produced the view data item is termed the view data lineage problem.

Motivation for and applications of lineage tracing in a warehousing environment are provided in [2, 3]. In the context of the WHIPS data warehousing project at Stanford [4], we have developed a system that performs efficient and consistent lineage tracing. Some commercial data warehousing systems support schema-level lineage tracing, or provide specialized drill-down and/or drill-through facilities for multi-dimensional warehouse views.

Our lineage tracing system supports more fine-grained instance-level lineage tracing for arbitrarily complex relational views, including aggregation. At view definition time, our system automatically generates lineage tracing procedures and supporting auxiliary views. At lineage tracing time, the system applies the tracing procedures to the source tables and/or auxiliary views to obtain the lineage results and to illustrate the specific view data derivation process.
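
To make the view data lineage problem concrete, the following is a minimal sketch of instance-level lineage tracing for a single aggregate view. It is not the WHIPS implementation: the tables `stores` and `sales`, their contents, and the functions `build_view` and `trace_lineage` are all hypothetical, and the tracing procedure is written by hand for this one view, whereas the system described above generates such procedures automatically at view definition time.

```python
from collections import defaultdict

# Hypothetical source tables (illustrative data, not taken from the paper).
stores = [
    {"store_id": 1, "city": "Palo Alto"},
    {"store_id": 2, "city": "San Jose"},
]
sales = [
    {"sale_id": 10, "store_id": 1, "amount": 30},
    {"sale_id": 11, "store_id": 1, "amount": 20},
    {"sale_id": 12, "store_id": 2, "amount": 50},
]


def build_view(stores, sales):
    """Materialize the view: total sales amount per city (join + aggregation)."""
    city_of = {s["store_id"]: s["city"] for s in stores}
    totals = defaultdict(int)
    for sale in sales:
        totals[city_of[sale["store_id"]]] += sale["amount"]
    return [{"city": city, "total": total} for city, total in sorted(totals.items())]


def trace_lineage(view_row, stores, sales):
    """Return the base rows of each source table that derive view_row.

    Hand-written tracing procedure for this specific view (an assumption
    made for illustration): select the sales rows that fall into the view
    row's group, then the store rows those sales joined with.
    """
    city_of = {s["store_id"]: s["city"] for s in stores}
    sale_rows = [x for x in sales if city_of[x["store_id"]] == view_row["city"]]
    used_ids = {x["store_id"] for x in sale_rows}
    store_rows = [s for s in stores if s["store_id"] in used_ids]
    return {"stores": store_rows, "sales": sale_rows}


view = build_view(stores, sales)
lineage = trace_lineage(view[0], stores, sales)
```

Here `view[0]` is the row for "Palo Alto", and its lineage consists of the one store row and the two sales rows that were joined and summed to produce it. Re-running the view definition over only those lineage rows would reproduce the same view row, which is the sense in which they are the "exact set" of contributing base data items.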