A Genetic Approach to Multivariate Microaggregation for Database Privacy

  • Authors:
  • Antoni Martinez-Balleste;Agusti Solanas;Josep Domingo-Ferrer;Josep M. Mateo-Sanz

  • Affiliations:
  • Dept. of Computer Engineering and Maths, Universitat Rovira i Virgili, Av. Països Catalans 26, E-43007 Tarragona, Catalonia. e-mail antoni.martinez@urv.cat;Dept. of Computer Engineering and Maths, Universitat Rovira i Virgili, Av. Països Catalans 26, E-43007 Tarragona, Catalonia. e-mail agusti.solanas@urv.cat;Dept. of Computer Engineering and Maths, Universitat Rovira i Virgili, Av. Països Catalans 26, E-43007 Tarragona, Catalonia. e-mail josep.domingo@urv.cat;Dept. of Computer Engineering and Maths, Universitat Rovira i Virgili, Av. Països Catalans 26, E-43007 Tarragona, Catalonia. e-mail josepmaria.mateo@urv.cat

  • Venue:
  • ICDEW '07 Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering Workshop
  • Year:
  • 2007

Abstract

Microaggregation is a technique used to protect privacy in databases and location-based services. We propose a new hybrid technique for multivariate microaggregation that combines a heuristic yielding fixed-size groups with a genetic algorithm yielding variable-sized groups. Fixed-size heuristics are fast and can handle large data sets, but the information loss they inflict is sometimes far from optimal. The genetic algorithm, on the other hand, obtains very good results (i.e., optimal or near-optimal), but it can only cope with very small data sets. Our technique leverages the advantages of both types of heuristics and avoids their shortcomings. First, it partitions the data set into a number of groups using a fixed-size heuristic. Then, it optimizes the partitions by means of the genetic algorithm. The resulting combination improves on the results of the fixed-size heuristic for large data sets.
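The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the abstract does not name the fixed-size heuristic or describe the genetic operators. Everything here is an assumption made for illustration: the stand-in heuristic (sorting on the first attribute and cutting into chunks), the label-based chromosome encoding, the one-point crossover and single-gene mutation, the parameters `chunk_factor`, `pop_size` and `generations`, and the use of the within-group sum of squared errors (SSE) as the information-loss measure. Group sizes are constrained to lie between k and 2k-1, a bound commonly used in microaggregation.

```python
import random

def sse(groups):
    """Within-group sum of squared errors, used here as information loss."""
    total = 0.0
    for g in groups:
        dim = len(g[0])
        centroid = [sum(r[d] for r in g) / len(g) for d in range(dim)]
        total += sum((r[d] - centroid[d]) ** 2 for r in g for d in range(dim))
    return total

def fixed_size_partition(records, size):
    """Hypothetical stand-in heuristic: sort records, cut into chunks of `size`."""
    ordered = sorted(records)
    chunks = [ordered[i:i + size] for i in range(0, len(ordered), size)]
    if len(chunks) > 1 and len(chunks[-1]) < size:
        tail = chunks.pop()          # fold a short tail into the previous chunk
        chunks[-1].extend(tail)
    return chunks

def decode(chunk, labels):
    """Turn a chromosome (one group label per record) into a list of groups."""
    groups = {}
    for rec, lab in zip(chunk, labels):
        groups.setdefault(lab, []).append(rec)
    return list(groups.values())

def fitness(chunk, labels, k):
    """SSE of the encoded partition; infeasible partitions score infinity."""
    groups = decode(chunk, labels)
    if any(len(g) < k or len(g) > 2 * k - 1 for g in groups):
        return float("inf")
    return sse(groups)

def genetic_refine(chunk, k, rng, pop_size=30, generations=200):
    """Assumed GA: search for variable-sized groups inside one small chunk."""
    n = len(chunk)                   # requires n >= k >= 2
    n_labels = n // k
    # Seed with the fixed-size solution so the GA never does worse than it.
    seed = [min(i // k, n_labels - 1) for i in range(n)]
    population = [seed[:] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: fitness(chunk, c, k))
        parents = population[: pop_size // 2]      # elitist selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)              # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n)] = rng.randrange(n_labels)  # mutation
            children.append(child)
        population = parents + children
    best = min(population, key=lambda c: fitness(chunk, c, k))
    return decode(chunk, best)

def hybrid_microaggregate(records, k, chunk_factor=3, seed=0):
    """Stage 1: coarse fixed-size chunks. Stage 2: GA refinement per chunk."""
    rng = random.Random(seed)
    groups = []
    for chunk in fixed_size_partition(records, chunk_factor * k):
        groups.extend(genetic_refine(chunk, k, rng))
    return groups
```

Calling `hybrid_microaggregate(data, k=3)` on a list of numeric tuples returns a list of groups, and comparing `sse(...)` of that result with `sse(fixed_size_partition(data, 3))` shows whether the refinement reduced information loss on that data. Because each chunk handed to the genetic stage holds only a few multiples of k records, the search space stays small, which mirrors the abstract's point that the genetic algorithm can only cope with very small data sets.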