Data quality through model checking techniques

  • Authors:
  • Mario Mezzanzanica;Roberto Boselli;Mirko Cesarini;Fabio Mercorio

  • Affiliations:
  • Department of Statistics, C.R.I.S.P. research center, University of Milan Bicocca, Italy;Department of Statistics, C.R.I.S.P. research center, University of Milan Bicocca, Italy;Department of Statistics, C.R.I.S.P. research center, University of Milan Bicocca, Italy;Department of Computer Science, University of L'Aquila, Italy

  • Venue:
  • IDA'11 Proceedings of the 10th international conference on Advances in intelligent data analysis X
  • Year:
  • 2011

Abstract

The paper introduces Robust Data Quality Analysis, a methodology that uses formal methods to support data quality improvement processes. The methodology can be applied to data sources containing sequences of events that can be modelled as finite state systems. Consistency rules, derived from domain business rules, can be expressed formally and verified automatically on the data, both before and after cleansing activities are executed. The assessment results can provide information useful for improving the data quality processes. The paper outlines preliminary results of applying the methodology to a real case: the cleansing of a very low quality database containing the work careers of the inhabitants of an Italian province. The methodology proved successful, giving insight into the data quality levels and suggesting how to improve the overall data quality process.
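To make the idea of checking consistency rules on event sequences concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical two-state career model ("unemployed", "employed") with two hypothetical event types ("start", "cessation"); none of these names or rules come from the paper, and a real model checker would explore a much richer state space. The sketch encodes the allowed transitions as a table and reports the first event in a career that has no allowed transition.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical consistency rule, assumed for illustration only:
# a job can start only when unemployed, and cease only when employed.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("unemployed", "start"): "employed",
    ("employed", "cessation"): "unemployed",
}


def first_inconsistency(events: Iterable[str],
                        initial: str = "unemployed") -> Optional[int]:
    """Return the index of the first event with no allowed transition,
    or None if the whole sequence is consistent with TRANSITIONS."""
    state = initial
    for index, event in enumerate(events):
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            return index
        state = next_state
    return None


def inconsistent_careers(careers: Dict[str, List[str]]) -> Dict[str, int]:
    """Map each inconsistent career id to the index of its first bad event."""
    report: Dict[str, int] = {}
    for person_id, events in careers.items():
        index = first_inconsistency(events)
        if index is not None:
            report[person_id] = index
    return report
```

Running `inconsistent_careers` on the same (hypothetical) data before and after a cleansing step, and comparing the two reports, mirrors the before-and-after assessment the abstract describes: careers that remain in the second report point to rules the cleansing did not enforce.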