Automatically identifying relations in privacy policies

Authors:
John W. Stamey; Ryan A. Rossi
Affiliations:
Coastal Carolina University, Conway, SC, USA; NASA Jet Propulsion Laboratory, Pasadena, CA, USA
Venue:
Proceedings of the 27th ACM international conference on Design of communication
Year:
2009

Abstract

E-commerce privacy policies tend to contain many linguistic ambiguities that protect companies more than customers. The ambiguities are currently divided into four patterns: mitigation (downplaying frequency), enhancement (emphasizing nonessential qualities), obfuscation (hedging claims and obscuring causality), and omission (removing agents). A number of phrases have been identified as creating ambiguities within these four categories. When a customer accepts the terms and conditions of a privacy policy, mitigation words and phrases such as "occasionally" or "from time to time" actually give the e-commerce vendor permission to send as many spam email offers as it deems necessary. Our study uses techniques based on Latent Semantic Analysis to discover the underlying semantic relations between words in privacy policies. Additional potential ambiguities and other word relations are found automatically. Words are clustered by topic using principal directions. This yields a ranking of the most significant words in each clustered topic as well as a ranking of the privacy policy topics. We also extract a signature that forms the basis of a typical privacy policy. These results lead to the design of Hermes, a system for analyzing privacy policies. Given an arbitrary privacy policy, our system provides a list of the potential ambiguities along with a score that represents its similarity to a typical privacy policy.
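
The kind of output described above (a list of potential ambiguities plus a similarity score) can be illustrated with a minimal Python sketch. This is not the Hermes implementation and it does not perform Latent Semantic Analysis: it swaps in plain phrase matching and bag-of-words cosine similarity as a simpler stand-in. The function names, the averaged term-frequency "signature", and the sample policies are all assumptions made for illustration; only the two mitigation phrases come from the abstract.

```python
import math
import re
from collections import Counter

# Only these two phrases appear in the abstract; a real list would be longer.
MITIGATION_PHRASES = ["occasionally", "from time to time"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def find_ambiguities(policy, phrases=MITIGATION_PHRASES):
    """Return the listed phrases that occur in the policy as whole words."""
    padded = " " + " ".join(tokenize(policy)) + " "
    return [p for p in phrases if " " + p + " " in padded]


def signature(policies):
    """Hypothetical signature: the average term-frequency vector of a corpus."""
    total = Counter()
    for policy in policies:
        total.update(tokenize(policy))
    return {term: count / len(policies) for term, count in total.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def analyze(policy, corpus):
    """Return (matched phrases, similarity to the corpus signature)."""
    vector = dict(Counter(tokenize(policy)))
    return find_ambiguities(policy), cosine(vector, signature(corpus))


# Made-up sample policies, used only to exercise the functions.
corpus = [
    "We may share your email address with partners.",
    "We collect your email address and may send offers from time to time.",
]
phrases, score = analyze(
    "We may occasionally send offers to your email address.", corpus
)
print(phrases, round(score, 3))
```

An LSA-based variant would first project the term vectors into a low-rank space obtained from a singular value decomposition of the term-document matrix, then compute the same cosine score in that space.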