Measuring and Comparing Effectiveness of Data Quality Techniques

  • Authors:
  • Lei Jiang; Daniele Barone; Alex Borgida; John Mylopoulos

  • Affiliations:
  • Dept. of Computer Science, University of Toronto; Dept. of Computer Science, Università di Milano Bicocca; Dept. of Computer Science, University of Toronto, and Dept. of Computer Science, Rutgers University; Dept. of Computer Science, University of Toronto, and Dept. of Information Engineering and Computer Science, University of Trento

  • Venue:
  • CAiSE '09 Proceedings of the 21st International Conference on Advanced Information Systems Engineering
  • Year:
  • 2009

Abstract

Poor-quality data may be detected and corrected by performing various quality assurance activities that rely on techniques with different efficacy and cost. In this paper, we propose a quantitative approach for measuring and comparing the effectiveness of these data quality (DQ) techniques. Our definitions of effectiveness are inspired by measures proposed in Information Retrieval. We show how the effectiveness of a DQ technique can be mathematically estimated in general cases, using formal techniques that are based on probabilistic assumptions. We then show how the resulting effectiveness formulas can be used to evaluate, compare, and make choices involving DQ techniques.
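
The abstract does not give the paper's actual effectiveness formulas, so the following is only a minimal sketch of what an Information Retrieval style measure could look like when applied to an error-detection technique. It assumes the standard IR definitions of precision, recall, and F1; the function name `detection_effectiveness`, the record identifiers, and the example data are hypothetical and not taken from the paper.

```python
def detection_effectiveness(flagged, actual_errors):
    """Score an error-detection technique with standard IR measures.

    flagged       -- ids of records the technique reported as erroneous
    actual_errors -- ids of records that are truly erroneous (ground truth)

    Assumption: these are the textbook precision/recall/F1 definitions,
    not necessarily the definitions used in the paper.
    """
    flagged = set(flagged)
    actual_errors = set(actual_errors)

    # Records that were flagged and are genuinely erroneous.
    true_positives = len(flagged & actual_errors)

    # Precision: share of flagged records that are real errors.
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: share of real errors that the technique caught.
    recall = true_positives / len(actual_errors) if actual_errors else 0.0

    # F1: harmonic mean of precision and recall (0 when both are 0).
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: the technique flags records 1-4,
# while records 2-6 are the ones that actually contain errors.
precision, recall, f1 = detection_effectiveness({1, 2, 3, 4}, {2, 3, 4, 5, 6})
print(precision, recall, f1)
```

Under these assumptions, two techniques could be compared by computing the same scores on a shared ground-truth sample and weighing them against each technique's cost.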