A natural language approach to automated cryptanalysis of two-time pads

  • Authors:
  • Joshua Mason;Kathryn Watkins;Jason Eisner;Adam Stubblefield

  • Affiliations:
  • Johns Hopkins University;Johns Hopkins University;Johns Hopkins University;Johns Hopkins University

  • Venue:
  • Proceedings of the 13th ACM conference on Computer and communications security
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

While keystream reuse in stream ciphers and one-time pads has been a well known problem for several decades, the risk to real systems has been underappreciated. Previous techniques have relied on being able to accurately guess words and phrases that appear in one of the plaintext messages, making it far easier to claim that "an attacker would never be able to do that." In this paper, we show how an adversary can automatically recover messages encrypted under the same keystream if only the type of each message is known (e.g. an HTML page in English). Our method, which is related to HMMs, recovers the most probable plaintext of this type by using a statistical language model and a dynamic programming algorithm. It produces up to 99% accuracy on realistic data and can process ciphertexts at 200ms per byte on a $2,000 PC. To further demonstrate the practical effectiveness of the method, we show that our tool can recover documents encrypted by Microsoft Word 2002 [22].