Linguistic properties of multi-word passphrases

  • Authors:
  • Joseph Bonneau; Ekaterina Shutova

  • Affiliations:
  • Computer Laboratory, University of Cambridge, UK; Computer Laboratory, University of Cambridge, UK

  • Venue:
  • FC'12 Proceedings of the 16th international conference on Financial Cryptography and Data Security
  • Year:
  • 2012

Abstract

We examine patterns of human choice in a passphrase-based authentication system deployed by Amazon, a large online merchant. We tested the availability of a large corpus of over 100,000 possible phrases at Amazon's registration page, which prohibits using any phrase already registered by another user. A number of large, readily available lists such as movie and book titles prove effective in guessing attacks, suggesting that passphrases are vulnerable to dictionary attacks like all schemes involving human choice. Extending our analysis with natural language phrases extracted from linguistic corpora, we find that phrase selection is far from random, with users strongly preferring simple noun bigrams which are common in natural language. The distribution of chosen passphrases is less skewed than the distribution of bigrams in English text, indicating that some users have attempted to choose phrases randomly. Still, the distribution of bigrams in natural language is not nearly random enough to resist offline guessing, nor are longer three- or four-word phrases, for which we see rapidly diminishing returns.
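
To illustrate why a skewed phrase distribution weakens resistance to dictionary attacks, the following minimal Python sketch compares the fraction of accounts an attacker would compromise with a fixed number of guesses under a skewed and a uniform distribution. The phrase counts, the function names `success_rate` and `min_entropy_bits`, and the choice of these two metrics are assumptions made for illustration only; they are not data or methods taken from the paper.

```python
import math
from collections import Counter


def success_rate(counts, guesses):
    """Fraction of accounts broken by trying the `guesses` most common phrases."""
    total = sum(counts.values())
    top = sorted(counts.values(), reverse=True)[:guesses]
    return sum(top) / total


def min_entropy_bits(counts):
    """Min-entropy in bits: -log2 of the probability of the most common phrase."""
    total = sum(counts.values())
    return -math.log2(max(counts.values()) / total)


# Hypothetical toy counts (NOT from the paper): 100 users, 5 distinct phrases.
skewed = Counter({
    "harry potter": 50,
    "star wars": 25,
    "dark knight": 15,
    "purple giraffe": 5,
    "quiet lantern": 5,
})
# Same phrases, but chosen uniformly at random by the same 100 users.
uniform = Counter({phrase: 20 for phrase in skewed})

print(success_rate(skewed, 2))    # attacker tries the 2 most popular phrases
print(success_rate(uniform, 2))
print(min_entropy_bits(skewed))
print(min_entropy_bits(uniform))
```

In this toy setting, two guesses cover a much larger share of users under the skewed distribution than under the uniform one, which is the qualitative effect the abstract describes for popular titles and common bigrams.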