This paper describes a framework and algorithms for ensuring a specified level of privacy in text data sets. Recent work has quantified the likelihood of privacy breaches in text data; we build on these notions to provide a means of controlling such breaches, couched in a multi-class classification framework. Our framework, called Text Inference Control, gives the user fine-grained control over the level of privacy required for the sensitive concepts present in the data. The framework is also designed to respect a user-defined utility metric, which our methods try to maximize while redacting. Beyond the framework and algorithms themselves, we report encouraging results on multiple data sets: the sensitive category is protected while the utility category is largely preserved, against both automated attackers and human subjects.
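The core trade-off described above — suppressing evidence for the sensitive class while preserving evidence for the utility class — can be illustrated with a greedy redaction sketch. This is not the paper's Text Inference Control algorithm; it is a minimal toy under the assumption that each classifier's behavior is summarized by per-word evidence weights (the `sensitive_score` and `utility_score` dictionaries below are hypothetical stand-ins for trained model weights).

```python
# Toy sketch of utility-aware redaction. Assumes two pre-trained classifiers
# whose per-word evidence is captured as word -> weight maps; the weights and
# the 0.5 utility penalty below are illustrative, not from the paper.

def redact(tokens, sensitive_score, utility_score, budget):
    """Greedily mask the tokens most indicative of the sensitive class.

    tokens: list of words in the document
    sensitive_score: word -> evidence weight toward the sensitive category
    utility_score: word -> evidence weight toward the utility category
    budget: redact until the remaining sensitive evidence is <= budget
    """
    masked = list(tokens)

    def total_sensitive():
        return sum(sensitive_score.get(w, 0.0)
                   for w in masked if w != "[REDACTED]")

    while total_sensitive() > budget:
        candidates = [
            (i, w) for i, w in enumerate(masked)
            if w != "[REDACTED]" and sensitive_score.get(w, 0.0) > 0
        ]
        if not candidates:
            break
        # Pick the token with the best privacy gain per unit of utility lost.
        i, _ = max(
            candidates,
            key=lambda iw: sensitive_score.get(iw[1], 0.0)
                           - 0.5 * utility_score.get(iw[1], 0.0),
        )
        masked[i] = "[REDACTED]"
    return masked


# Hypothetical weights: "chemotherapy"/"oncology" reveal the sensitive
# (medical) category; "appointment"/"tuesday" carry scheduling utility.
sensitive_score = {"chemotherapy": 3.0, "oncology": 2.5, "appointment": 0.2}
utility_score = {"appointment": 2.0, "tuesday": 1.5}

tokens = ["chemotherapy", "appointment", "tuesday", "oncology"]
print(redact(tokens, sensitive_score, utility_score, budget=0.5))
# The strongly sensitive words are masked; the useful scheduling words survive.
```

Note the design choice: the selection key trades privacy gain against utility loss, so a word like "appointment" — weakly sensitive but strongly useful — is kept even though masking it would also lower the sensitive evidence.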