Privacy risks in health databases from aggregate disclosure

Authors:
Gautam Das;Nan Zhang
Affiliations:
The University of Texas at Arlington, Arlington, TX;The George Washington University, Washington, DC
Venue:
Proceedings of the 2nd International Conference on PErvasive Technologies Related to Assistive Environments
Year:
2009

Abstract

This paper focuses on privacy risks in health databases that arise in assistive environments, where a person's interactions with the environment are captured and assimilated, and events of interest are extracted from them. The stakeholders of such an environment range from caregivers and doctors to supporting family members. The environment also includes objects the person interacts with, such as wireless devices, that generate data about these interactions. The data streams generated by such an environment are massive. The resulting databases are usually considered hidden, i.e., they are accessible online only through restrictive front-end web interfaces. Security issues specific to such hidden databases, however, have been largely overlooked by the research community, possibly because of the false sense of security that restrictive access provides. We argue that an urgent challenge facing such databases is the disclosure of sensitive aggregates, enabled by recent studies on sampling hidden databases through their public web interfaces. To protect sensitive aggregates, we enunciate the key design principles, propose a three-component design, and suggest a number of techniques that may protect sensitive aggregates while maintaining service quality for normal search users. Our hope is that this paper sheds light on a fruitful direction for future research on security issues related to hidden web databases.
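
To make the threat described in the abstract concrete, the following is a minimal sketch of how an aggregate (here, the total row count) might leak through a restrictive top-k interface. It is an illustration under stated assumptions, not the authors' method or their proposed defense: the `HiddenDatabase` class, the `search` and `estimate_count` functions, the boolean attributes, and the prefix-style query form are all hypothetical. The sketch uses a random drill-down with inverse-probability weighting, and it assumes all rows are distinct.

```python
import random


class HiddenDatabase:
    """Toy stand-in (hypothetical) for a database behind a top-k search form."""

    def __init__(self, rows, k):
        self._rows = rows  # each row is a tuple of 0/1 attribute values
        self.k = k         # the interface never returns more than k rows

    def search(self, prefix):
        """Match rows whose first len(prefix) attributes equal prefix.

        Returns (visible_rows, overflow). overflow is True when more than
        k rows match, in which case the answer is truncated to k rows.
        """
        matches = [r for r in self._rows if r[:len(prefix)] == prefix]
        return matches[:self.k], len(matches) > self.k


def estimate_count(db, num_attrs, rng):
    """Run one random drill-down and return an estimate of the row count."""
    prefix = ()
    visible, overflow = db.search(prefix)
    # Narrow the query with a random value for the next attribute
    # until the interface stops truncating the answer.
    while overflow and len(prefix) < num_attrs:
        prefix += (rng.randint(0, 1),)
        visible, overflow = db.search(prefix)
    # This query is reached with probability 2 ** -len(prefix), so the
    # visible count is weighted by the inverse of that probability.
    return len(visible) * 2 ** len(prefix)


# Illustrative data: 200 distinct rows over 12 boolean attributes.
rng = random.Random(0)
NUM_ATTRS, NUM_ROWS = 12, 200
codes = rng.sample(range(2 ** NUM_ATTRS), NUM_ROWS)
rows = [tuple((c >> i) & 1 for i in range(NUM_ATTRS)) for c in codes]
db = HiddenDatabase(rows, k=10)

# Averaging many independent drill-downs approximates the hidden count.
estimates = [estimate_count(db, NUM_ATTRS, rng) for _ in range(2000)]
mean_estimate = sum(estimates) / len(estimates)
print(mean_estimate)
```

Although each query reveals at most `k` rows, the average of many such estimates approaches the true size of the database, which the interface never reports directly. The same weighting could in principle be applied to the count of rows satisfying a sensitive condition, which is the kind of aggregate disclosure the abstract warns about.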