Microaggregation- and permutation-based anonymization of movement data

  • Authors:
  • Josep Domingo-Ferrer;Rolando Trujillo-Rasua

  • Affiliations:
  • Universitat Rovira i Virgili, Department of Computer Engineering and Mathematics, UNESCO Chair in Data Privacy, Av. Paısos Catalans 26, E-43007 Tarragona, Catalonia, Spain;Universitat Rovira i Virgili, Department of Computer Engineering and Mathematics, UNESCO Chair in Data Privacy, Av. Paısos Catalans 26, E-43007 Tarragona, Catalonia, Spain

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2012

Quantified Score

Hi-index 0.07

Visualization

Abstract

Movement data, that is, trajectories of mobile objects, are automatically collected in huge quantities by technologies such as GPS, GSM or RFID, among others. Publishing and exploiting such data is essential to improve transportation, to understand the dynamics of the economy in a region, etc. However, there are obvious threats to the privacy of individuals if their trajectories are published in a way which allows re-identification of the individual behind a trajectory. We contribute to the literature on privacy-preserving publication of trajectories by presenting a distance measure for trajectories which naturally considers both spatial and temporal aspects of trajectories, is computable in polynomial time, and can cluster trajectories not defined over the same time span. Our distance measure can be naturally instantiated using other existing similarity measures for trajectories that are appropriate for anonymization purposes. Then, we propose two heuristics for trajectory anonymization which yield anonymized trajectories formed by fully accurate true original locations. The first heuristic is based on trajectory microaggregation using the above distance and on location permutation; it effectively achieves trajectory k-anonymity. The second heuristic is based only on location permutation; it gives up trajectory k-anonymity and aims at location k-diversity. The strong point of the second heuristic is that it takes into account reachability constraints when computing anonymized trajectories. Experimental results on a synthetic data set and a real-life data set are presented; for similar privacy protection levels and most reasonable parameter choices, our two methods offer better utility than comparable previous proposals in the literature.