A machine learning solution to assess privacy policy completeness: (short paper)

  • Authors:
  • Elisa Costante;Yuanhao Sun;Milan Petković;Jerry den Hartog

  • Affiliations:
  • TU/e, Eindhoven, Netherlands;TU/e, Eindhoven, Netherlands;TU/e & Philips Research, Eindhoven, Netherlands;TU/e, Eindhoven, Netherlands

  • Venue:
  • Proceedings of the 2012 ACM Workshop on Privacy in the Electronic Society
  • Year:
  • 2012

Abstract

A privacy policy is a legal document used by websites to communicate how the personal data they collect will be managed. By accepting it, users agree to release their data under the conditions stated by the policy. Privacy policies should provide enough information to enable users to make informed decisions, and privacy regulations support this by specifying what kind of information has to be provided. Because privacy policies can be long and difficult to understand, users tend not to read them. As a result, users generally agree to a policy without knowing what it states or whether the aspects important to them are covered at all. In this paper we present a solution that assists the user by providing a structured way to browse the policy content and by automatically assessing the completeness of a policy, i.e. the degree of coverage of the privacy categories important to the user. The privacy categories are extracted from privacy regulations, while text categorization and machine learning techniques are used to verify which categories are covered by a policy. The results show the feasibility of our approach: it is possible to build an automatic classifier that assigns the right category to the paragraphs of a policy with an accuracy approximating that obtainable by a human judge.
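The notion of completeness described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the category names, the cue words, and the per-user importance weights are all hypothetical, and a simple keyword matcher stands in for the trained text classifier. The sketch only shows how per-paragraph category labels could be turned into a coverage score.

```python
import re

# Hypothetical categories and cue words. The paper extracts its categories
# from privacy regulations and learns a classifier from labelled paragraphs;
# this keyword table is only a stand-in for that trained model.
CATEGORY_KEYWORDS = {
    "collection": {"collect", "gather"},
    "sharing": {"share", "disclose", "third"},
    "retention": {"retain", "store", "delete"},
    "security": {"encrypt", "secure", "protect"},
}


def classify_paragraph(paragraph):
    """Return the category whose cue words occur most often, or None if none occur."""
    tokens = re.findall(r"[a-z]+", paragraph.lower())
    counts = {
        category: sum(token in cues for token in tokens)
        for category, cues in CATEGORY_KEYWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def completeness(paragraphs, user_weights):
    """Weighted fraction of the user's categories covered by at least one paragraph.

    user_weights maps a category to how much the user cares about it
    (an assumed representation of "categories important to the user").
    """
    covered = {classify_paragraph(p) for p in paragraphs}
    total = sum(user_weights.values())
    if total == 0:
        return 0.0
    hit = sum(w for category, w in user_weights.items() if category in covered)
    return hit / total


# Toy policy: two paragraphs match a category, the third matches none.
policy = [
    "We collect your email address when you register.",
    "We never share data with third parties.",
    "Contact us with any questions.",
]
weights = {"collection": 1.0, "sharing": 1.0, "security": 2.0}
score = completeness(policy, weights)
print(f"completeness = {score:.2f}")
```

In this toy policy the collection and sharing categories are covered but security is not, so the score reflects only the weight of the two covered categories. Swapping `classify_paragraph` for a trained model would leave the scoring step unchanged.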