Privacy and anonymization for very large datasets

  • Authors:
  • Victor Muntés-Mulero; Jordi Nin

  • Affiliations:
  • DAMA-UPC, Universitat Politècnica de Catalunya, Barcelona, Spain; Centre National de la Recherche Scientifique, Toulouse, France

  • Venue:
  • Proceedings of the 18th ACM conference on Information and knowledge management
  • Year:
  • 2009

Abstract

With the growing number of publicly available data sources and the growing interest in analyzing them, privacy issues have moved to the center of attention in many applications. The vast amount of data collected on individuals and organizations as a result of cyberinfrastructure advances, or collected by statistical agencies, for instance, has made traditional ways of protecting social science data obsolete. This has given rise to a range of techniques for tackling the problem, and to analyses of their limitations in such environments, such as Aggarwal's seminal study of anonymization techniques and their dependence on data dimensionality. Increasingly accessible high-capacity storage devices make it possible to keep more detailed information from many areas. While this enriches the information and the conclusions that can be extracted from the data, it poses a serious problem for most previous work on privacy, which has focused on quality and paid little attention to performance. This workshop aims to bring together researchers in data privacy and anonymization and researchers in high-performance computing and the management of very large data volumes. We seek to collect the most recent advances in data privacy and anonymization (e.g., anonymization techniques, statistical disclosure techniques, privacy in machine learning algorithms, and privacy in graphs or social networks) and in high performance and data management (e.g., algorithms and structures for efficient data management, and parallel or distributed systems).