Quality views: capturing and exploiting the user perspective on data quality

  • Authors:
  • Paolo Missier;Suzanne Embury;Mark Greenwood;Alun Preece;Binling Jin

  • Affiliations:
  • School of Computer Science, University of Manchester, Manchester, UK;School of Computer Science, University of Manchester, Manchester, UK;School of Computer Science, University of Manchester, Manchester, UK;Computing Science, University of Aberdeen, Aberdeen, UK;Computing Science, University of Aberdeen, Aberdeen, UK

  • Venue:
  • VLDB '06 Proceedings of the 32nd international conference on Very large data bases
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

There is a growing awareness among life scientists of the variability in quality of the data in public repositories, and of the threat that poor data quality poses to the validity of experimental results. No standards are available, however, for computing quality levels in this data domain. We argue that data processing environments used by life scientists should feature facilities for expressing and applying quality-based, personal data acceptability criteria.We propose a framework for the specification of users' quality processing requirements, called quality views. These views are compiled and semi-automatically embedded within the data processing environment. The result is a quality management toolkit that promotes rapid prototyping and reuse of quality components. We illustrate the utility of the framework by showing how it can be deployed within Taverna, a scientific workflow management tool, and applied to actual workflows for data analysis in proteomics.