AJAX: an extensible data cleaning tool

  • Authors: Helena Galhardas; Daniela Florescu; Dennis Shasha; Eric Simon

  • Affiliations: INRIA Rocquencourt, France; INRIA Rocquencourt, France; Courant Institute, NY; INRIA Rocquencourt, France

  • Venue: SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
  • Year: 2000

Abstract

@@@@ groups together matching pairs with a high similarity value by applying a given grouping criterion (e.g., by transitive closure). Finally, ging collapses each individual cluster into a tuple of the resulting data source. AJAX provides @@@@ for specifying data cleaning programs, which consists of SQL statements enriched with a set of specific primitives to express these transformations. AJAX also @@@@. It allows the user to interact with an executing data cleaning program to handle exceptional cases and to inspect intermediate results. Finally, AJAX provides @@@@ @@@@ that permits users to determine the source and processing of data for debugging purposes. We will present the AJAX system applied to two real-world problems: the consolidation of a telecommunication database, and the conversion of a dirty database of bibliographic references into a set of clean, normalized, and redundancy-free relational tables maintaining the same data.
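
The grouping and collapsing steps described above can be illustrated with a minimal Python sketch. It is not AJAX's own language or implementation: the function names, the similarity threshold, and the rule used to collapse a cluster (keep the longest value of each field) are all assumptions made for illustration. Grouping by transitive closure is done here with a small union-find structure.

```python
from collections import defaultdict


def cluster_by_transitive_closure(pairs, threshold):
    """Group record ids linked by matching pairs whose similarity >= threshold.

    `pairs` is an iterable of (id_a, id_b, similarity); names are hypothetical.
    """
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, similarity in pairs:
        root_a, root_b = find(a), find(b)
        if similarity >= threshold and root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    clusters = defaultdict(set)
    for record_id in list(parent):
        clusters[find(record_id)].add(record_id)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


def collapse(cluster, records):
    """Collapse one cluster into a single tuple.

    Assumed policy: for each field, keep the longest value seen in the cluster.
    """
    rows = [records[record_id] for record_id in cluster]
    return tuple(max(values, key=len) for values in zip(*rows))


# Toy data, invented for this example.
records = {
    1: ("J. Smith", "Paris"),
    2: ("John Smith", "Paris"),
    3: ("John Smith", ""),
    4: ("A. Jones", "NY"),
}
pairs = [(1, 2, 0.90), (2, 3, 0.95), (3, 4, 0.20)]

clusters = cluster_by_transitive_closure(pairs, threshold=0.8)
cleaned = [collapse(c, records) for c in clusters]
```

Records 1 and 3 are never compared directly with a high score, yet they end up in the same cluster because each matches record 2; that is the effect of the transitive closure criterion. A different grouping criterion or collapsing rule would replace the corresponding function without changing the overall flow.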