As the Internet continues to permeate and connect communities, businesses, and things, there is increasing demand for new approaches and technologies to analyze and synthesize data generated from diverse and distributed sources. This data must also be accessible to users with different analytic objectives and viewpoints. We examine these topics in light of the growing number of data consortia in sectors such as finance and healthcare, whose role is to share data among a set of contributing members. We address the need for data consortia to apply data customization and context-alignment services that make heterogeneous data relevant to their subscribers. Such services include record linkage, record selection, and scaling and homogeneity analysis. In addition, the often personal or business-sensitive nature of such data requires privacy-preservation methods to avoid improper disclosures. We provide a publication process model for data consortia that allows users to extract the maximum amount of information from these heterogeneous databases in a privacy-aware manner. We describe the Operational Riskdata eXchange (ORX) as a successful case study to illustrate these concepts.