An efficient quasi-identifier index based approach for privacy preservation over incremental data sets on cloud

Authors:
Xuyun Zhang;Chang Liu;Surya Nepal;Jinjun Chen
Affiliations:
Faculty of Engineering and Information Technology, University of Technology Sydney, PO Box 123, Broadway, NSW 2007, Australia;Faculty of Engineering and Information Technology, University of Technology Sydney, PO Box 123, Broadway, NSW 2007, Australia;Centre for Information & Communication Technologies, Commonwealth Scientific and Industrial Research Organisation, Cnr Vimiera and Pembroke Roads, Marsfield, NSW 2122, Australia;Faculty of Engineering and Information Technology, University of Technology Sydney, PO Box 123, Broadway, NSW 2007, Australia
Venue:
Journal of Computer and System Sciences
Year:
2013

Abstract

Cloud computing provides massive computation power and storage capacity, which enable users to deploy applications without infrastructure investment. Many privacy-sensitive applications, such as health services, are built on cloud for economic benefits and operational convenience. Data sets in these applications are usually anonymized to protect data owners' privacy, but the privacy requirements can be violated when new data arrive over time. Most existing approaches address this problem either by re-anonymizing all data sets from scratch after each update or by anonymizing the new data incrementally according to the already anonymized data sets. However, privacy preservation over incremental data sets remains challenging in the context of cloud because most data sets are of huge volume and distributed across multiple storage nodes. Existing approaches suffer from poor scalability and inefficiency because they are centralized and access all data frequently when updates occur. In this paper, we propose an efficient quasi-identifier index based approach to ensure privacy preservation and achieve high data utility over incremental and distributed data sets on cloud. Quasi-identifiers, which represent the groups of anonymized data, are indexed for efficiency. An algorithm is designed to realize this approach. Evaluation results demonstrate that our approach significantly improves the efficiency of privacy preservation on large-volume incremental data sets over existing approaches.
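
The idea of indexing quasi-identifier groups can be illustrated with a minimal sketch. This is not the algorithm from the paper, which the abstract does not specify. It is a toy, single-machine example under stated assumptions: the generalization rule (10-year age bands, 3-digit zip prefixes), the `QIIndex` class, and the k-anonymity threshold `k` are all hypothetical names chosen for illustration. The point it shows is that a new record can be placed into its anonymized group with one index lookup instead of a scan over all existing data.

```python
from collections import defaultdict


def generalize(age, zipcode):
    """Hypothetical generalization: 10-year age band and 3-digit zip prefix."""
    low = (age // 10) * 10
    return (f"{low}-{low + 9}", zipcode[:3] + "**")


class QIIndex:
    """Toy index mapping a generalized quasi-identifier to its record group."""

    def __init__(self, k):
        self.k = k                       # assumed k-anonymity threshold
        self.groups = defaultdict(list)  # quasi-identifier -> sensitive values

    def insert(self, age, zipcode, sensitive):
        # One dictionary lookup places the new record in its group;
        # existing groups are not rescanned or re-anonymized.
        qi = generalize(age, zipcode)
        self.groups[qi].append(sensitive)
        return qi

    def published(self):
        # Only groups holding at least k records are released.
        return {qi: vals for qi, vals in self.groups.items()
                if len(vals) >= self.k}


index = QIIndex(k=2)
index.insert(34, "20071", "flu")
index.insert(37, "20075", "asthma")
index.insert(52, "21220", "diabetes")  # group of size 1, withheld for now
```

In this sketch the third record stays unpublished until a later update brings another record into the same group. A distributed setting, as targeted by the paper, would additionally need the index to be partitioned across storage nodes, which this example does not attempt.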