Experiment explorer: lightweight provenance search over metadata
TaPP'12 Proceedings of the 4th USENIX conference on Theory and Practice of Provenance
SourceTrac: tracing data sources within spreadsheets
IPAW'12 Proceedings of the 4th international conference on Provenance and Annotation of Data and Processes
Characterizing workflow-based activity on a production e-infrastructure using provenance data
Future Generation Computer Systems
Large experiments on distributed infrastructures are becoming increasingly complex to manage. In particular, it is difficult to trace all the computations that gave rise to a piece of data or to an event such as an error. This paper describes the design and implementation of an architecture to support experiment provenance, and its deployment on a particular e-infrastructure for biosciences. The proposed solution consists of: (a) a data provenance repository that captures scientific experiments and their execution paths, (b) a software tool (crawler) that gathers, classifies, links, and stores the information collected from various sources, and (c) a set of user interfaces through which the end-user can access the provenance data, interpret the results, and trace the sources of failure. The approach is based on PLIER, an OPM-compliant API that is flexible enough to support future extensions and that facilitates interoperability among heterogeneous application systems.
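To make the idea of tracing a failure back to its sources concrete, the following is a minimal sketch of an OPM-style provenance graph in Python. It is not the PLIER API: the class `ProvenanceStore`, its methods, and the node names are hypothetical and chosen only for illustration. The only elements taken from the Open Provenance Model are the two edge types `used` and `wasGeneratedBy`, which point from an effect to its cause.

```python
from collections import deque

# Edge types defined by the Open Provenance Model (OPM).
USED = "used"
WAS_GENERATED_BY = "wasGeneratedBy"


class ProvenanceStore:
    """Toy in-memory stand-in for a provenance repository (hypothetical)."""

    def __init__(self):
        self.kinds = {}   # node id -> "artifact" or "process"
        self.edges = {}   # effect id -> list of (edge type, cause id)

    def add_node(self, node_id, kind):
        self.kinds[node_id] = kind

    def add_edge(self, effect, edge_type, cause):
        # OPM edges point from the effect to the cause.
        self.edges.setdefault(effect, []).append((edge_type, cause))

    def trace(self, node_id):
        """Return all nodes that node_id depends on, in breadth-first order."""
        seen = {node_id}
        order = []
        queue = deque([node_id])
        while queue:
            current = queue.popleft()
            for _edge_type, cause in self.edges.get(current, []):
                if cause not in seen:
                    seen.add(cause)
                    order.append(cause)
                    queue.append(cause)
        return order


# Illustrative experiment: two processing steps, the second of which fails.
store = ProvenanceStore()
for artifact in ("raw.dat", "aligned.dat", "error.log"):
    store.add_node(artifact, "artifact")
for process in ("align", "analyze"):
    store.add_node(process, "process")

store.add_edge("align", USED, "raw.dat")
store.add_edge("aligned.dat", WAS_GENERATED_BY, "align")
store.add_edge("analyze", USED, "aligned.dat")
store.add_edge("error.log", WAS_GENERATED_BY, "analyze")

# Walk back from the error to everything that contributed to it.
lineage = store.trace("error.log")
print(lineage)
```

In this sketch, a crawler would be the component that calls `add_node` and `add_edge` as it collects information from the infrastructure, and a user interface would call `trace` to show which processes and input data led to a given error.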