Privacy and anonymization for very large datasets

Authors:
Victor Muntés-Mulero;Jordi Nin
Affiliations:
DAMA-UPC. Universitat Politècnica de Catalunya, Barcelona, Spain;Centre National de la Recherche Scientifique, Toulouse, France
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

References (13):

A fast filtering scheme for large database cleansing
Proceedings of the eleventh international conference on Information and knowledge management

Practical Data-Oriented Microaggregation for Statistical Disclosure Control
IEEE Transactions on Knowledge and Data Engineering

Achieving k-anonymity privacy protection using generalization and suppression
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems

Minimum Spanning Tree Partitioning Algorithm for Microaggregation
IEEE Transactions on Knowledge and Data Engineering

On k-anonymity and the curse of dimensionality
VLDB '05 Proceedings of the 31st international conference on Very large data bases

A 2^d-Tree-Based Blocking Method for Microaggregating Very Large Data Sets
ARES '06 Proceedings of the First International Conference on Availability, Reliability and Security

L-diversity: Privacy beyond k-anonymity
ACM Transactions on Knowledge Discovery from Data (TKDD)

Towards identity anonymization on graphs
Proceedings of the 2008 ACM SIGMOD international conference on Management of data

Automatic record linkage using seeded nearest neighbour and support vector machine classification
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining

Parallelizing Record Linkage for Disclosure Risk Assessment
PSD '08 Proceedings of the UNESCO Chair in data privacy international conference on Privacy in Statistical Databases

Preserving Privacy in Social Networks Against Neighborhood Attacks
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering

Data Access in a Cyber World: Making Use of Cyberinfrastructure
Transactions on Data Privacy

Preserving the privacy of sensitive relationships in graph data
PinKDD'07 Proceedings of the 1st ACM SIGKDD international conference on Privacy, security, and trust in KDD

Cited by (2):

Social networking applications in health care: threats to the privacy and security of health information
Proceedings of the 2010 ICSE Workshop on Software Engineering in Health Care

Privacy-preserving statistical analysis on ubiquitous health data
TrustBus'11 Proceedings of the 8th international conference on Trust, privacy and security in digital business

Abstract

With the growing number of public data sources and the growing interest in analyzing them, privacy has become a central concern in many applications. The vast amount of data collected on human beings and organizations as a result of cyberinfrastructure advances, or collected by statistical agencies, for instance, has made traditional ways of protecting social science data obsolete. This has given rise to a range of techniques for tackling the problem and to analyses of their limitations in such environments, such as the seminal study by Aggarwal of anonymization techniques and their dependency on data dimensionality. The growing availability of high-capacity storage devices makes it possible to keep more detailed information from many areas. While this enriches the information and the conclusions that can be extracted from the data, it poses a serious problem for most previous work on privacy, which has focused on quality and paid little attention to performance. In this workshop, we want to bring together researchers in data privacy and anonymization and researchers working on high performance and the management of very large data volumes. We seek to collect the most recent advances in data privacy and anonymization (e.g., anonymization techniques, statistical disclosure techniques, privacy in machine learning algorithms, and privacy in graphs or social networks) and in high performance and data management (e.g., algorithms and structures for efficient data management, and parallel or distributed systems).