A publication process model to enable privacy-aware data sharing

Authors:
A. Gkoulalas-Divanis; E. W. Cope
Affiliations:
IBM Research Division, Zurich Research Laboratory, Rüschlikon, Switzerland; IBM Research Division, Zurich Research Laboratory, Rüschlikon, Switzerland
Venue:
IBM Journal of Research and Development
Year:
2011

Abstract

As the Internet continues to permeate and connect communities, businesses, and things, there is an increasing demand for new approaches and technologies to analyze and synthesize data generated by diverse and distributed sources. This data must also be accessible to users with different analytic objectives and viewpoints. We examine these topics in light of the growing number of data consortia in sectors such as finance and healthcare, whose role is to share data among a set of contributing members. We address the need for data consortia to apply data customization and context-alignment services that make heterogeneous data relevant for their subscribers. Such services include record linkage, record selection, and scaling and homogeneity analysis. Moreover, the often personal or business-sensitive nature of such data requires that privacy-preservation methods be employed to avoid improper disclosures. We provide a publication process model for data consortia that allows users to extract the maximum amount of information from these heterogeneous databases in a privacy-aware manner. We describe the Operational Riskdata eXchange (ORX) as a successful case study to illustrate these concepts.