A natural language approach to automated cryptanalysis of two-time pads

Authors:
Joshua Mason; Kathryn Watkins; Jason Eisner; Adam Stubblefield
Affiliations:
Johns Hopkins University; Johns Hopkins University; Johns Hopkins University; Johns Hopkins University
Venue:
Proceedings of the 13th ACM conference on Computer and communications security
Year:
2006

Citing 9

Intercepting mobile communications: the insecurity of 802.11. Proceedings of the 7th annual international conference on Mobile computing and networking
Applied Cryptography: Protocols, Algorithms, and Source Code in C
Introduction To Automata Theory, Languages, And Computation
Substitution Deciphering Based on HMMs with Applications to Compressed Document Processing. IEEE Transactions on Pattern Analysis and Machine Intelligence
Attacking and repairing the winZip encryption scheme. Proceedings of the 11th ACM conference on Computer and communications security
Fast dictionary attacks on passwords using time-space tradeoff. Proceedings of the 12th ACM conference on Computer and communications security
Keyboard acoustic emanations revisited. Proceedings of the 12th ACM conference on Computer and communications security
Timing analysis of keystrokes and timing attacks on SSH. SSYM'01 Proceedings of the 10th conference on USENIX Security Symposium - Volume 10
Scaling high-order character language models to gigabytes. Software '05 Proceedings of the Workshop on Software

Cited 2

When cryptography meets storage. Proceedings of the 4th ACM international workshop on Storage security and survivability
Crypt analysis of two time pads in case of compressed speech. Computers and Electrical Engineering

Abstract

Although keystream reuse in stream ciphers and one-time pads has been a well-known problem for several decades, the risk to real systems has been underappreciated. Previous techniques relied on accurately guessing words and phrases that appear in one of the plaintext messages, which made it far easier to claim that "an attacker would never be able to do that." In this paper, we show how an adversary can automatically recover messages encrypted under the same keystream when only the type of each message is known (e.g., an HTML page in English). Our method, which is related to hidden Markov models (HMMs), recovers the most probable plaintext of this type using a statistical language model and a dynamic programming algorithm. It achieves up to 99% accuracy on realistic data and processes ciphertexts at 200 ms per byte on a $2,000 PC. To further demonstrate the practical effectiveness of the method, we show that our tool can recover documents encrypted by Microsoft Word 2002 [22].
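
The idea behind the attack can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy add-one-smoothed character bigram model trained on one short sentence, where the paper describes a much stronger statistical language model. All names here (`xor_bytes`, `train_bigram`, `recover`, the sample corpus and keystream) are hypothetical and chosen only for illustration. The sketch shows that XORing two ciphertexts produced with the same keystream cancels the keystream, and that a Viterbi-style dynamic program can then search for the most probable plaintext pair consistent with that XOR.

```python
import math
from collections import Counter

def xor_bytes(a, b):
    """XOR two byte strings position by position."""
    return bytes(x ^ y for x, y in zip(a, b))

def train_bigram(corpus, alphabet):
    """Return logp(prev, cur) for an add-one-smoothed character bigram model."""
    pairs = Counter(zip(corpus, corpus[1:]))
    firsts = Counter(corpus[:-1])
    v = len(alphabet)
    def logp(prev, cur):
        return math.log((pairs[(prev, cur)] + 1) / (firsts[prev] + v))
    return logp

def recover(x, logp1, logp2, alphabet):
    """Viterbi search for the most probable (p1, p2) with p1 XOR p2 == x."""
    if not x:
        return b"", b""
    allowed = set(alphabet)
    def candidates(i):
        # Values a for p1[i] such that p2[i] = a ^ x[i] is also in the alphabet.
        return [a for a in alphabet if (a ^ x[i]) in allowed]
    # score[a] = best log-probability of any prefix ending with p1[i] == a
    score = {a: 0.0 for a in candidates(0)}
    back = []
    for i in range(1, len(x)):
        new_score, pointers = {}, {}
        for a in candidates(i):
            b = a ^ x[i]
            best_prev, best = None, -math.inf
            for pa, s in score.items():
                pb = pa ^ x[i - 1]
                t = s + logp1(pa, a) + logp2(pb, b)
                if t > best:
                    best_prev, best = pa, t
            new_score[a], pointers[a] = best, best_prev
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    a = max(score, key=score.get)
    path = [a]
    for pointers in reversed(back):
        a = pointers[a]
        path.append(a)
    p1 = bytes(reversed(path))
    return p1, xor_bytes(p1, x)

# Toy training data and messages (assumptions for this demo only).
corpus = b"the quick brown fox jumps over the lazy dog and then the dog sleeps"
alphabet = sorted(set(corpus))
logp = train_bigram(corpus, alphabet)

p1 = b"the dog sleeps"
p2 = b"the lazy brown"
keystream = bytes((37 * i + 11) % 256 for i in range(len(p1)))  # reused keystream

c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)
x = xor_bytes(c1, c2)  # the keystream cancels: x == p1 XOR p2

g1, g2 = recover(x, logp, logp, alphabet)
print(g1, g2)
```

With a model this small the guesses are not expected to match the true plaintexts reliably, and because the same model scores both messages, the two plaintexts are interchangeable from the search's point of view. The sketch only guarantees that the recovered pair is consistent with the observed XOR.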