Privacy compliance for free-text documents is a challenge facing many organizations. Named entity recognition techniques and machine learning methods can detect private information, such as personally identifiable information (PII) and personal health information (PHI), in free-text documents. However, these methods cannot measure how much private information a document contains. In this paper, we propose a framework for measuring the privacy content of free-text documents. The measure consists of two factors: the probability that the text can be used to uniquely identify a person, and the degree of sensitivity of the private entities associated with that person. We then instantiate the framework for the detection and protection of PHI in medical records, a challenge for many hospitals, clinics, and other medical institutions. Experiments on a real dataset show the effectiveness of the proposed measure.
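The two-factor measure can be illustrated with a minimal Python sketch. The abstract does not say how either factor is computed or how the two are combined, so everything below is an assumption made for illustration: the function names are hypothetical, the identification probability is estimated from attribute frequencies treated as independent, the sensitivity of a document is taken as the highest weight among its private entities, and the two factors are multiplied.

```python
from math import prod


def identification_probability(attribute_frequencies, population_size):
    """Estimate the chance that the listed attributes single out one person.

    Assumption (not from the paper): each frequency is the fraction of the
    population sharing that attribute value, and attributes are independent.
    """
    expected_matches = population_size * prod(attribute_frequencies)
    # If fewer than one person is expected to match, treat the text as
    # uniquely identifying (probability 1).
    return 1.0 / max(expected_matches, 1.0)


def sensitivity(entity_weights):
    """Assumption: a document is as sensitive as its most sensitive entity.

    Weights are hypothetical values in [0, 1]; no entities means 0.
    """
    return max(entity_weights, default=0.0)


def privacy_score(attribute_frequencies, population_size, entity_weights):
    """Assumption: combine the two factors by multiplication."""
    p = identification_probability(attribute_frequencies, population_size)
    return p * sensitivity(entity_weights)


# Hypothetical record: two identifying attributes shared by 50% and 1% of a
# population of 1000, plus two private entities with made-up weights.
score = privacy_score([0.5, 0.01], 1000, [0.3, 0.9])
print(score)
```

Under these assumptions, a document scores high only when it both narrows the population to very few people and mentions a highly sensitive entity; either factor near zero pulls the score toward zero.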