The lineage of a datum records its processing history. Because this information can be used to trace anomalies and errors in processed data sets back to their source, it is valuable to users for applications such as anomaly investigation and debugging. Traditional data lineage approaches rely on metadata. However, metadata does not scale well to fine-grained lineage, especially in large data sets. For example, it is not feasible to store all of the information needed to trace a specific floating-point value in a processed data set back to a particular satellite image pixel in a source data set. In this paper, we propose a novel method to support fine-grained data lineage. Rather than relying on metadata, our approach lazily computes the lineage using a limited amount of information about the processing operators and the base data. We introduce the notions of weak inversion and verification. While our system does not perfectly invert the data, it uses weak inversion and verification to provide a number of guarantees about the lineage it generates. We propose a design for the implementation of weak inversion and verification in an object-relational database management system.
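As a rough illustration of lazily computing lineage from operator knowledge and base data rather than stored metadata, the sketch below pairs a weak inverse with a verification step. It is a minimal sketch under stated assumptions, not the design proposed in the paper: the `block_max` operator, the `BLOCK` size, and the function names `weak_inverse`, `verify`, and `lineage` are all hypothetical and chosen only for this example.

```python
from typing import List, Sequence

BLOCK = 4  # hypothetical block size; assumed to be known from the operator's definition


def block_max(source: Sequence[float], block: int = BLOCK) -> List[float]:
    """Forward operator (hypothetical): reduce each block of values to its maximum."""
    return [max(source[i:i + block]) for i in range(0, len(source), block)]


def weak_inverse(out_index: int, source_len: int, block: int = BLOCK) -> List[int]:
    """Weak inversion: source positions that could have contributed to one output.

    The result may be a superset of the true lineage. It is derived from the
    operator's structure alone; no per-datum metadata is stored.
    """
    start = out_index * block
    return list(range(start, min(start + block, source_len)))


def verify(candidates: Sequence[int], source: Sequence[float], out_value: float) -> List[int]:
    """Verification: keep only candidates whose base value reproduces the output."""
    return [i for i in candidates if source[i] == out_value]


def lineage(out_index: int, output: Sequence[float], source: Sequence[float],
            block: int = BLOCK) -> List[int]:
    """Compute lineage on demand: weak inversion first, then verification."""
    candidates = weak_inverse(out_index, len(source), block)
    return verify(candidates, source, output[out_index])


# Example base data and processed data (made-up values for illustration).
source = [3.0, 9.5, 1.0, 9.5, 2.0, 7.0, 4.0]
output = block_max(source)

first = lineage(0, output, source)   # positions in block 0 holding the maximum
second = lineage(1, output, source)  # positions in block 1 holding the maximum
```

In this toy setting the weak inverse alone returns every position in the block, and verification narrows that set to the positions that actually hold the emitted maximum. The only information kept about the operator is its block size, which loosely mirrors the idea of using a limited amount of information about the processing operators and the base data.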