Towards semantic microaggregation of categorical data for confidential documents

  • Authors:
  • Daniel Abril;Guillermo Navarro-Arribas;Vicenç Torra

  • Affiliations:
  • Institut d'Investigació en Intel·ligència Artificial, Consejo Superior de Investigaciones Científicas, Catalonia, Spain;Institut d'Investigació en Intel·ligència Artificial, Consejo Superior de Investigaciones Científicas, Catalonia, Spain;Institut d'Investigació en Intel·ligència Artificial, Consejo Superior de Investigaciones Científicas, Catalonia, Spain

  • Venue:
  • MDAI'10 Proceedings of the 7th international conference on Modeling decisions for artificial intelligence
  • Year:
  • 2010

Abstract

In data privacy, and specifically in statistical disclosure control, microaggregation is a well-known microdata protection method that ensures the confidentiality of each individual. In this paper, we propose a new approach to microaggregation that deals with semantic sets of categorical data, such as text documents. The method relies on WordNet, which provides a complete taxonomy of semantic relationships between words. This extension therefore aims to ensure the confidentiality of text documents while preserving their general meaning. We evaluate the quality of the protection method with measures based on information loss.
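
The general idea of microaggregating words over a semantic taxonomy can be illustrated with a minimal sketch. This is not the authors' algorithm: the tiny `PARENT` taxonomy is a hypothetical stand-in for WordNet, the greedy grouping is a simplification, and `information_loss` is an assumed illustrative measure rather than one taken from the paper. Each group of at least k words is replaced by its most specific common generalization, so that no word can be told apart from the other members of its group.

```python
# Toy hypernym taxonomy (child -> parent); a hypothetical stand-in for WordNet.
PARENT = {
    "animal": "entity", "vehicle": "entity",
    "dog": "animal", "cat": "animal", "horse": "animal",
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
}

def ancestors(word):
    """Path from `word` up to the root, starting with the word itself."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def lowest_common_ancestor(words):
    """Most specific term that generalizes every word in `words`.

    Assumes every word belongs to the taxonomy, so the root is always shared.
    """
    common = set(ancestors(words[0]))
    for w in words[1:]:
        common &= set(ancestors(w))
    # Walking up from the first word, the first shared term is the deepest one.
    return next(a for a in ancestors(words[0]) if a in common)

def distance(a, b):
    """Number of taxonomy edges between a and b through their common ancestor."""
    lca = lowest_common_ancestor([a, b])
    return ancestors(a).index(lca) + ancestors(b).index(lca)

def microaggregate(words, k):
    """Greedy grouping: replace each group of >= k words by its common ancestor.

    Assumes `words` are distinct and that len(words) >= k.
    """
    remaining = list(words)
    protected = {}
    while remaining:
        if len(remaining) < 2 * k:
            # The last group absorbs the leftovers so no group is smaller than k.
            group = list(remaining)
        else:
            seed = remaining[0]
            group = sorted(remaining, key=lambda w: distance(seed, w))[:k]
        label = lowest_common_ancestor(group)
        for w in group:
            protected[w] = label
        remaining = [w for w in remaining if w not in group]
    return protected

def information_loss(protected):
    """Assumed measure: mean taxonomy distance from each word to its replacement."""
    return sum(distance(w, g) for w, g in protected.items()) / len(protected)

if __name__ == "__main__":
    result = microaggregate(["dog", "car", "cat", "truck", "horse", "bus"], k=3)
    print(result)
    print(information_loss(result))
```

With k = 3, the six example words fall into two groups, generalized to "animal" and "vehicle". A larger k forces coarser generalizations (eventually "entity"), which raises the assumed loss measure; this is the trade-off between confidentiality and preserved meaning that the abstract describes.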