Heuristic supervised approach for record linkage

  • Authors:
  • Javier Murillo; Daniel Abril; Vicenç Torra

  • Affiliations:
  • CIFASIS-CONICET, Universidad Nacional de Rosario, Argentina; Universitat Autònoma de Barcelona (UAB), Barcelona, Spain, Institut d'Investigació en Intel·ligència Artificial (IIIA), Consejo Superior de Investigaciones Científicas (CSIC ...; Institut d'Investigació en Intel·ligència Artificial (IIIA), Consejo Superior de Investigaciones Científicas (CSIC), Barcelona, Spain

  • Venue:
  • MDAI'12: Proceedings of the 9th International Conference on Modeling Decisions for Artificial Intelligence
  • Year:
  • 2012

Abstract

Record linkage is a well-known technique for linking records in one database to records in another database that refer to the same individuals. Although it is mostly used in database integration, it is also used in data privacy to evaluate the disclosure risk of protected datasets. In this paper we compare two supervised algorithms that rely on distance-based record linkage, specifically using the Choquet integral, a fuzzy integral, to compute the distance between records. The first approach solves a linear optimization problem that determines the optimal fuzzy measure for the linkage. The second approach is a kind of gradient algorithm with constraints for identifying the fuzzy measure. We show the advantages and drawbacks of both algorithms and the situations in which each works better.
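
To make the distance concrete, the following is a minimal Python sketch of distance-based record linkage with a Choquet integral. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how records are compared attribute by attribute, so squared differences are assumed here, and the names `choquet_integral`, `choquet_distance`, `link` and `additive_measure` are hypothetical. The fuzzy measure `mu` is taken as given; learning it (by linear optimization or by a constrained gradient method) is what the two compared algorithms do and is not shown.

```python
from itertools import combinations


def choquet_integral(values, mu):
    """Discrete Choquet integral of `values` with respect to fuzzy measure `mu`.

    `mu` maps frozensets of attribute indices to weights, with
    mu[frozenset()] == 0 and mu[all indices] == 1 (assumed, not checked).
    """
    # Visit attributes in ascending order of their values.
    order = sorted(range(len(values)), key=lambda i: values[i])
    total, previous = 0.0, 0.0
    for pos, i in enumerate(order):
        # Attributes whose value is at least values[i].
        coalition = frozenset(order[pos:])
        total += (values[i] - previous) * mu[coalition]
        previous = values[i]
    return total


def choquet_distance(a, b, mu):
    """Aggregate per-attribute squared differences (an assumption of this sketch)."""
    diffs = [(x - y) ** 2 for x, y in zip(a, b)]
    return choquet_integral(diffs, mu)


def link(originals, protected, mu):
    """For each protected record, return the index of the nearest original record."""
    return [
        min(range(len(originals)),
            key=lambda j: choquet_distance(originals[j], p, mu))
        for p in protected
    ]


def additive_measure(weights):
    """Build an additive fuzzy measure; the Choquet integral then reduces to a weighted mean."""
    n = len(weights)
    return {
        frozenset(c): sum(weights[i] for i in c)
        for r in range(n + 1)
        for c in combinations(range(n), r)
    }


if __name__ == "__main__":
    mu = additive_measure([0.5, 0.5])
    originals = [(0.0, 0.0), (1.0, 1.0)]
    protected = [(0.9, 1.1), (0.1, -0.1)]
    print(link(originals, protected, mu))
```

With an additive measure the distance is a weighted mean of the attribute differences. A non-additive measure lets the weight of an attribute depend on which other attributes it is combined with, which is the extra flexibility a supervised method can exploit when fitting the measure to known links.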