While data provenance is a well-studied topic in both database and workflow systems, supporting it within stream processing systems presents a new set of challenges. Part of the challenge lies in the high stream event rates and the low processing latency requirements imposed by many streaming applications. For example, emerging streaming applications in healthcare or finance call for data provenance, as illustrated by Century, the stream processing infrastructure that we are building to support online healthcare analytics. At any time, given an output data element (e.g., a medical alert) generated by Century, the system must be able to retrieve the input and intermediate data elements that led to its generation. In this paper, we describe the requirements behind our initial implementation of Century's provenance subsystem. We then analyze its strengths and limitations and propose a new provenance architecture to address some of these limitations. The paper also includes a discussion of the open challenges in this area.
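The retrieval requirement stated in the abstract (given an output element, recover the input and intermediate elements that led to it) can be illustrated with a minimal sketch. This is not Century's design, which the abstract does not detail. The `ProvenanceStore` class, its `record` and `trace` methods, and the element identifiers are all hypothetical, and the sketch assumes that dependencies are recorded explicitly in memory as each element is produced.

```python
from collections import defaultdict


class ProvenanceStore:
    """Hypothetical in-memory store: maps each element to the elements it was derived from."""

    def __init__(self):
        self._parents = defaultdict(set)

    def record(self, output_id, input_ids):
        # Note that output_id was derived directly from input_ids.
        self._parents[output_id].update(input_ids)

    def trace(self, output_id):
        # Walk the dependency graph backward and collect every
        # input and intermediate element that contributed to output_id.
        seen = set()
        stack = [output_id]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Illustrative identifiers only: three heart-rate readings feed an
# average, which together with a blood-pressure reading raises an alert.
store = ProvenanceStore()
store.record("avg_hr_1", ["hr_1", "hr_2", "hr_3"])
store.record("alert_1", ["avg_hr_1", "bp_7"])
lineage = store.trace("alert_1")
```

Recording every dependency explicitly, as this sketch does, is the simplest way to meet the requirement, but its cost grows with the event rate, which is the tension between provenance and low-latency processing that the abstract points to.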