Semandaq: a data quality system based on conditional functional dependencies

  • Authors: Wenfei Fan; Floris Geerts; Xibei Jia
  • Affiliations: University of Edinburgh and Bell Laboratories; University of Edinburgh; University of Edinburgh
  • Venue: Proceedings of the VLDB Endowment
  • Year: 2008

Abstract

We present Semandaq, a prototype system for improving the quality of relational data. Based on the recently proposed conditional functional dependencies (CFDs), it detects and repairs errors and inconsistencies that emerge as violations of these constraints. We demonstrate the following functionalities supported by Semandaq: (a) an interface for specifying CFDs; (b) a visual tool for automated detection of CFD violations in relational data, leveraging efficient SQL-based techniques; (c) extensive visual data exploration capabilities that provide the user with various measures of the quality of the data; (d) repair (cleaning) functionality without excessive human interaction, built upon CFD-based cleaning algorithms. We also show how Semandaq allows for a natural exploration of the quality of the obtained repairs. Semandaq is a promising tool that provides easily accessible, user-friendly data quality facilities for any relational database system.
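
To make SQL-based detection of CFD violations more concrete, here is a minimal sketch. It is not Semandaq's actual implementation or query generation, which the abstract does not describe. The table `cust`, its columns, the sample rows, and the helper `find_violations` are all hypothetical. The rule checked is the illustrative CFD ([cc = '44', zip] -> [street]): among tuples with country code 44, the zip code should determine the street. Tuples with other country codes are outside the scope of the rule.

```python
import sqlite3

# Hypothetical customer table; column names and rows are illustrative only.
ROWS = [
    (1, "44", "EH4 1DT", "Mayfield Rd"),
    (2, "44", "EH4 1DT", "Crichton St"),   # conflicts with row 1 on street
    (3, "44", "W1B 1JH", "Regent St"),
    (4, "01", "07974", "Mountain Ave"),
    (5, "01", "07974", "Main St"),         # cc != '44', so the CFD does not apply
]


def find_violations(conn):
    """Return ids of tuples violating the CFD ([cc = '44', zip] -> [street])."""
    query = """
        SELECT c.id
        FROM cust AS c
        JOIN (
            -- zip codes that map to more than one street within cc = '44'
            SELECT zip
            FROM cust
            WHERE cc = '44'
            GROUP BY zip
            HAVING COUNT(DISTINCT street) > 1
        ) AS bad ON bad.zip = c.zip
        WHERE c.cc = '44'
        ORDER BY c.id
    """
    return [row[0] for row in conn.execute(query)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust (id INTEGER PRIMARY KEY, cc TEXT, zip TEXT, street TEXT)"
)
conn.executemany("INSERT INTO cust VALUES (?, ?, ?, ?)", ROWS)

violations = find_violations(conn)
print(violations)  # [1, 2]
```

The condition `cc = '44'` is what distinguishes this rule from a plain functional dependency: rows 4 and 5 share a zip code with different streets, yet they are not reported because the pattern does not match them.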