Theoretical Computer Science
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
XMill: an efficient compressor for XML data
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Supporting Fine-grained Data Lineage in a Database Visualization Environment
ICDE '97 Proceedings of the Thirteenth International Conference on Data Engineering
Lineage Tracing for General Data Warehouse Transformations
Proceedings of the 27th International Conference on Very Large Data Bases
Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation
SSDBM '02 Proceedings of the 14th International Conference on Scientific and Statistical Database Management
Applying Chimera virtual data concepts to cluster finding in the Sloan Sky Survey
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
The VLDB Journal — The International Journal on Very Large Data Bases
Global common subexpression elimination
Proceedings of a symposium on Compiler optimization
XGRIND: A Query-Friendly XML Compressor
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Earth System Science Workbench: A Data Management Infrastructure for Earth Science Products
SSDBM '01 Proceedings of the 13th International Conference on Scientific and Statistical Database Management
Workflow Mining: Discovering Process Models from Event Logs
IEEE Transactions on Knowledge and Data Engineering
Provenance management in curated databases
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
ULDBs: databases with uncertainty and lineage
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
A Framework for Collecting Provenance in Data-Centric Scientific Workflows
ICWS '06 Proceedings of the IEEE International Conference on Web Services
Provenance-aware storage systems
ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
Querying and Creating Visualizations by Analogy
IEEE Transactions on Visualization and Computer Graphics
An annotation management system for relational databases
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Recording and using provenance in a protein compressibility experiment
HPDC '05 Proceedings of the High Performance Distributed Computing, 2005. HPDC-14. Proceedings. 14th IEEE International Symposium
Special Issue: The First Provenance Challenge
Concurrency and Computation: Practice & Experience - The First Provenance Challenge
Automatic capture and efficient storage of e-Science experiment provenance
Concurrency and Computation: Practice & Experience - The First Provenance Challenge
Metadata in the collaboratory for multi-scale chemical science
DCMI '03 Proceedings of the 2003 international conference on Dublin Core and metadata applications: supporting communities of discourse and practice---metadata research & applications
Project histories: managing data provenance across collection-oriented scientific workflow runs
DILS'07 Proceedings of the 4th international conference on Data integration in the life sciences
Towards a model of provenance and user views in scientific workflows
DILS'06 Proceedings of the Third international conference on Data Integration in the Life Sciences
Provenance and Annotation of Data and Processes
Efficient provenance storage over nested data collections
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Purple SOX extraction management system
ACM SIGMOD Record
The case of the fake Picasso: preventing history forgery with secure provenance
FAST '09 Proceedings of the 7th conference on File and storage technologies
FAST '09 Proceedings of the 7th conference on File and storage technologies
Scalable access controls for lineage
TAPP'09 First workshop on Theory and practice of provenance
The case for browser provenance
TAPP'09 First workshop on Theory and practice of provenance
Detecting and resolving unsound workflow views for correct provenance analysis
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Exploring Scientific Workflow Provenance Using Hybrid Queries over Nested Data and Lineage Graphs
SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management
Provenance in Databases: Why, How, and Where
Foundations and Trends in Databases
Do You Know Where Your Data's Been? --- Tamper-Evident Database Provenance
SDM '09 Proceedings of the 6th VLDB Workshop on Secure Data Management
An Access Control Language for a General Provenance Model
SDM '09 Proceedings of the 6th VLDB Workshop on Secure Data Management
Preventing history forgery with secure provenance
ACM Transactions on Storage (TOS)
ACM Transactions on Storage (TOS)
A navigation model for exploring scientific workflow provenance graphs
Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science
Pipeline-centric provenance model
Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science
Provenance query evaluation: what's so special about it?
Proceedings of the 18th ACM conference on Information and knowledge management
Techniques for efficiently querying scientific workflow provenance graphs
Proceedings of the 13th International Conference on Extending Database Technology
Fine-grained and efficient lineage querying of collection-based workflow provenance
Proceedings of the 13th International Conference on Extending Database Technology
Proceedings of the 13th International Conference on Extending Database Technology
An optimal labeling scheme for workflow provenance using skeleton labels
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
RDFProv: A relational RDF store for querying and managing scientific workflow provenance
Data & Knowledge Engineering
Tracking back references in a write-anywhere file system
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Towards a secure and efficient system for end-to-end provenance
TAPP'10 Proceedings of the 2nd conference on Theory and practice of provenance
Preserving integrity and confidentiality of a directed acyclic graph model of provenance
DBSec'10 Proceedings of the 24th annual IFIP WG 11.3 working conference on Data and applications security and privacy
The Foundations for Provenance on the Web
Foundations and Trends in Web Science
Generating sound workflow views for correct provenance analysis
ACM Transactions on Database Systems (TODS)
Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Storage and use of provenance information for relational database queries
DASFAA'11 Proceedings of the 16th international conference on Database systems for advanced applications: Part II
PrIMe: A methodology for developing provenance-aware applications
ACM Transactions on Software Engineering and Methodology (TOSEM)
Efficient storage and temporal query evaluation in hierarchical data archiving systems
SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part II
Providing flexible tradeoff for provenance tracking
WISS'10 Proceedings of the 2010 international conference on Web information systems engineering
Database support for exploring scientific workflow provenance graphs
SSDBM'12 Proceedings of the 24th international conference on Scientific and Statistical Database Management
Context provenance to enhance the dependability of ambient intelligence systems
Personal and Ubiquitous Computing
ACM Transactions on Database Systems (TODS)
A framework for scalable distributed provenance storage system
Computer Standards & Interfaces
Efficient provenance storage for relational queries
Proceedings of the 21st ACM international conference on Information and knowledge management
A hybrid approach for efficient provenance storage
Proceedings of the 21st ACM international conference on Information and knowledge management
International Journal of Systems and Service-Oriented Engineering
WebLab PROV: computing fine-grained provenance links for XML artifacts
Proceedings of the Joint EDBT/ICDT 2013 Workshops
Ariadne: managing fine-grained provenance on data streams
Proceedings of the 7th ACM international conference on Distributed event-based systems
Evaluation of a Hybrid Approach for Efficient Provenance Storage
ACM Transactions on Storage (TOS)
A new compression algorithm of data provenance based on self-adaptive granularity
International Journal of Computer Applications in Technology
LogGC: garbage collecting audit log
Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security
Towards semantic comparison of multi-granularity process traces
Knowledge-Based Systems
As the world becomes increasingly networked and digitized, the data we store has more and more frequently been chopped, baked, diced and stewed. Consequently, there is a growing need to store and manage provenance for each data item in a database, describing exactly where it came from and what manipulations have been applied to it. Storing the complete provenance of each data item can become prohibitively expensive. In this paper, we identify important properties of provenance that can be used to considerably reduce the amount of storage required. We present three techniques for decreasing provenance storage: a family of factorization processes and two methods based on inheritance. We have used these techniques to significantly reduce the provenance storage costs associated with constructing MiMI [22], a warehouse of protein interaction data, as well as those of two provenance stores, Karma [31] and PReServ [20], produced through workflow execution. On these real provenance sets, we were able to reduce the size of the provenance by up to a factor of 20. Additionally, we show that the reduced store can be queried efficiently and that incremental changes can be made inexpensively.
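
The abstract does not spell out how factorization works, so the following is only a minimal sketch of the general idea, not the paper's algorithm: when many data items share an identical provenance record, that record can be stored once and referenced by position. The function names `factorize` and `lookup`, the tuple encoding of a provenance record, and the sample data are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, Hashable, List, Tuple

Record = Tuple[Hashable, ...]  # assumed encoding: one tuple of steps per item


def factorize(provenance: Dict[str, Record]) -> Tuple[List[Record], Dict[str, int]]:
    """Store each distinct provenance record once; items keep only an index."""
    table: List[Record] = []       # distinct provenance records
    index: Dict[Record, int] = {}  # record -> its position in `table`
    refs: Dict[str, int] = {}      # item id -> position in `table`
    for item, record in provenance.items():
        if record not in index:
            index[record] = len(table)
            table.append(record)
        refs[item] = index[record]
    return table, refs


def lookup(item: str, table: List[Record], refs: Dict[str, int]) -> Record:
    """Recover the full provenance record of one item from the factored store."""
    return table[refs[item]]


# Hypothetical sample: two items share the same derivation history.
raw = {
    "protein_1": ("source_A", "parse", "merge"),
    "protein_2": ("source_A", "parse", "merge"),
    "protein_3": ("source_B", "parse"),
}
table, refs = factorize(raw)
```

In this toy input, three items collapse to two stored records, and `lookup` still returns each item's complete provenance. How much a real store shrinks depends entirely on how much provenance its items actually share.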