SourceTrac: tracing data sources within spreadsheets

  • Authors: Hazeline U. Asuncion
  • Affiliations: Computing and Software Systems, University of Washington, Bothell, Bothell, WA
  • Venue: IPAW'12: Proceedings of the 4th International Conference on Provenance and Annotation of Data and Processes
  • Year: 2012

Abstract

Analyzing data from multiple sources is a common task in scientific research. In particular, spreadsheet data is often aggregated from a variety of sources to identify patterns and synthesize reports. Yet, techniques are lacking for automatically capturing the provenance of such data within spreadsheet environments like Excel. We present a novel approach for fine-grained tracing of tabular data that may have been obtained from files, databases, or the Web. Our approach provides relevant provenance information at both the micro-level (per cell) and the macro-level (per sheet). Initial results suggest that our approach is scalable and beneficial to data analysts.
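The two levels of provenance named in the abstract can be illustrated with a minimal sketch. This is not the SourceTrac implementation, which the abstract does not describe; the `Source` and `SheetProvenance` names, the cell addresses, and the example file name and URL are all hypothetical, chosen only to show how a per-cell record can be rolled up into a per-sheet summary.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical description of where a value came from."""
    kind: str      # "file", "database", or "web"
    location: str  # path, connection name, or URL (illustrative values only)


class SheetProvenance:
    """Illustrative provenance store for one sheet (an assumption, not the paper's design)."""

    def __init__(self):
        self._cells = {}  # cell address (e.g. "A1") -> Source

    def record(self, cell, source):
        """Micro-level: remember the source of a single cell."""
        self._cells[cell] = source

    def cell_source(self, cell):
        """Micro-level lookup; returns None for cells with no recorded source."""
        return self._cells.get(cell)

    def sheet_summary(self):
        """Macro-level: count how many cells each source contributed to the sheet."""
        return Counter(self._cells.values())


# Example usage with made-up sources.
csv_source = Source("file", "rainfall.csv")
web_source = Source("web", "https://example.org/stations")

prov = SheetProvenance()
prov.record("A1", csv_source)
prov.record("A2", csv_source)
prov.record("B1", web_source)

print(prov.cell_source("A1"))
print(prov.sheet_summary())
```

Keeping the per-cell mapping as the only stored state means the per-sheet view is always derived from it, so the two levels cannot drift apart in this sketch.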