On provenance and privacy

Authors:
Susan B. Davidson; Sanjeev Khanna; Sudeepa Roy; Julia Stoyanovich; Val Tannen; Yi Chen
Affiliations:
University of Pennsylvania, Philadelphia (Davidson, Khanna, Roy, Stoyanovich, Tannen); Arizona State University, Tempe (Chen)
Venue:
Proceedings of the 14th International Conference on Database Theory
Year:
2011

Abstract

Provenance in scientific workflows is a double-edged sword. On the one hand, recording information about the module executions used to produce a data item, as well as the parameter settings and intermediate data items passed between module executions, enables transparency and reproducibility of results. On the other hand, a scientific workflow often contains private or confidential data and uses proprietary modules. Hence, providing exact answers to provenance queries over all executions of the workflow may reveal private information. In this paper we discuss three privacy concerns in scientific workflows (data, module, and structural privacy) and frame several natural questions: (i) Can we formally analyze data, module, and structural privacy, giving provable privacy guarantees for an unlimited or bounded number of provenance queries? (ii) How can we answer search and structural queries over repositories of workflow specifications and their executions, providing as much information as possible to the user while still guaranteeing privacy? We then highlight some recent work in this area and point to several directions for future work.
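
The tension described in the abstract can be made concrete with a toy sketch. It is an illustration only, not the authors' formalism: the module `proprietary_module`, the attribute names `a`, `b`, `out`, and the helper functions are all hypothetical. The sketch shows that exact provenance over all executions lets an observer tabulate a module's full input/output behavior, while a view that hides one attribute leaves several candidate outputs for each visible input.

```python
from collections import defaultdict
from itertools import product


def proprietary_module(a: int, b: int) -> int:
    """Hypothetical private module; its input/output behavior is the secret."""
    return a ^ b


def run_workflow(inputs):
    """Run the module on each input pair and log one provenance record per run."""
    return [{"a": a, "b": b, "out": proprietary_module(a, b)} for a, b in inputs]


def observed_behavior(log, visible):
    """Collect the outputs seen for each combination of visible input attributes."""
    view = defaultdict(set)
    for record in log:
        key = tuple(record[attr] for attr in visible)
        view[key].add(record["out"])
    return dict(view)


# Provenance over all executions on the 2-bit input domain.
log = run_workflow(product([0, 1], repeat=2))

# Exact answers: every input maps to exactly one output, so the module is revealed.
exact = observed_behavior(log, visible=("a", "b"))

# View that hides attribute "b": each visible input is consistent with two outputs.
hidden = observed_behavior(log, visible=("a",))

print("exact view: ", exact)
print("hidden view:", hidden)
```

The trade-off in question (ii) shows up directly here: the hidden view still tells the user which outputs occurred, but it no longer determines the module's function on any single input.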