TaPP'12 Proceedings of the 4th USENIX conference on Theory and Practice of Provenance
This paper presents a provenance-based technique to support undoing and redoing data analysis tasks. Our technique targets scientists who experiment with combinations of approaches to processing raw data into presentable datasets. Raw data may be noisy and in need of cleaning; it may suffer from sensor drift that requires retrospective calibration and data correction; or it may need gap-filling because of sensor malfunction or environmental conditions. Different raw datasets may have different issues that require different kinds of adjustments, and each issue may be handled by more than one approach. Thus, scientists must often experiment with different sequences of approaches. In our work, we show how provenance information can be used to facilitate this kind of experimentation with scientific datasets. We describe an approach that supports the ability to (1) undo a set of tasks while setting aside the artifacts and consequences of performing those tasks, (2) replace, remove, or add a data-processing technique, and (3) automatically redo those set-aside tasks that are consistent with the changed technique. We have implemented our technique and demonstrate its utility with a case study of a common sensor-network data-processing scenario, showing how our approach can reduce the cost of changing intermediate data-processing techniques in a complex, data-intensive process.
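The three capabilities listed in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Record` and `Session` classes, their method names, and the sample cleaning functions are all hypothetical, and the sketch assumes a strictly linear pipeline in which every set-aside step is treated as consistent with the replacement technique. The abstract implies a consistency check before redoing a step, which is omitted here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Record:
    """Hypothetical provenance entry: which step ran, on what, producing what."""
    step: str
    func: Callable[[list], list]
    inputs: list
    outputs: list


@dataclass
class Session:
    """Hypothetical linear analysis session that records provenance per step."""
    raw: list
    history: List[Record] = field(default_factory=list)
    set_aside: List[Record] = field(default_factory=list)

    def current(self) -> list:
        # The latest dataset is the output of the last recorded step.
        return self.history[-1].outputs if self.history else self.raw

    def run(self, step: str, func: Callable[[list], list]) -> list:
        data = self.current()
        out = func(data)
        self.history.append(Record(step, func, data, out))
        return out

    def undo_to(self, step: str) -> None:
        # (1) Undo `step` and everything after it, but keep the records.
        idx = next(i for i, r in enumerate(self.history) if r.step == step)
        self.set_aside = self.history[idx:]
        self.history = self.history[:idx]

    def replace_and_redo(self, step: str, new_func: Callable[[list], list]) -> list:
        # (2) Swap in the new technique, then (3) redo the set-aside steps.
        # Assumption: every set-aside step is consistent with the change.
        pending, self.set_aside = self.set_aside, []
        for rec in pending:
            self.run(rec.step, new_func if rec.step == step else rec.func)
        return self.current()


def drop_negatives(xs):
    return [x for x in xs if x >= 0]


def clip_negatives(xs):
    return [max(x, 0) for x in xs]


def calibrate(xs):
    return [x + 1 for x in xs]


session = Session(raw=[3, -1, 5])
session.run("clean", drop_negatives)
session.run("calibrate", calibrate)
result_before = session.current()

session.undo_to("clean")
result_after = session.replace_and_redo("clean", clip_negatives)
```

In this sketch the calibration step is never re-specified by the user: it is replayed from the set-aside records after the cleaning technique changes, which is the cost saving the abstract describes. A fuller version would decide per step whether replay is still valid.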