Why and Where: A Characterization of Data Provenance

Authors:
Peter Buneman;Sanjeev Khanna;Wang Chiew Tan
Venue:
ICDT '01 Proceedings of the 8th International Conference on Database Theory
Year:
2001


Abstract

With the proliferation of database views and curated databases, the issue of data provenance (where a piece of data came from and the process by which it arrived in the database) is becoming increasingly important, especially in scientific databases, where understanding provenance is crucial to the accuracy and currency of data. In this paper we describe an approach to computing provenance when the data of interest has been created by a database query. We adopt a syntactic approach and present results for a general data model that applies to relational databases as well as to hierarchical data such as XML. A novel aspect of our work is a distinction between "why" provenance, which refers to the source data that had some influence on the existence of the data, and "where" provenance, which refers to the location(s) in the source databases from which the data was extracted.
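
The distinction between "why" and "where" provenance can be illustrated with a small sketch. This is not the paper's formalism or algorithm, only a toy reading of the two definitions given in the abstract. The relation `EMP`, its row identifiers, and the function `depts_above` are hypothetical names introduced for illustration, and the query is a simple select-project over a single flat relation.

```python
from collections import defaultdict

# Hypothetical toy relation (an assumption, not from the paper):
# row id -> attribute values.
EMP = {
    "e1": {"name": "Ann", "dept": "Sales", "salary": 60},
    "e2": {"name": "Bob", "dept": "Sales", "salary": 70},
    "e3": {"name": "Cy", "dept": "Ops", "salary": 40},
}


def depts_above(relation, threshold):
    """Evaluate SELECT DISTINCT dept FROM relation WHERE salary > threshold,
    recording a toy why- and where-provenance for each output value."""
    why = defaultdict(set)    # output value -> source rows that influenced its existence
    where = defaultdict(set)  # output value -> source cells the value was copied from
    for row_id, row in relation.items():
        if row["salary"] > threshold:
            out = row["dept"]
            # The whole row, including its salary, is a reason `out` appears.
            why[out].add(row_id)
            # Only the dept cell of that row actually supplied the value.
            where[out].add((row_id, "dept"))
    return dict(why), dict(where)


why, where = depts_above(EMP, 50)
print(why)
print(where)
```

With this data the only output value is "Sales". Its why-provenance consists of rows e1 and e2, whose salaries satisfy the selection, while its where-provenance consists of just the dept cells of those two rows. The salary values influence whether "Sales" appears, but they are not locations from which the output was extracted.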