High Efficiency Counter Mode Security Architecture via Prediction and Precomputation

Authors:
Weidong Shi;Hsien-Hsin S. Lee;Mrinmoy Ghosh;Chenghuai Lu;Alexandra Boldyreva
Affiliations:
Georgia Institute of Technology;Georgia Institute of Technology;Georgia Institute of Technology;Georgia Institute of Technology;Georgia Institute of Technology
Venue:
Proceedings of the 32nd annual international symposium on Computer Architecture
Year:
2005



Abstract

Encrypting data in unprotected memory has gained much interest lately for digital rights protection and security reasons. Counter mode is a well-known symmetric-key encryption scheme that can be built on any block cipher, e.g., AES. The scheme's encryption algorithm uses a block cipher, a secret key, and a counter (or sequence number) to generate an encryption pad, which is XORed with the data stored in memory. Like other memory encryption schemes, this method suffers from the inherent latency of decrypting encrypted data when loading it into the on-chip cache. One solution, which parallelizes data fetching and encryption pad generation, requires the sequence numbers of evicted cache lines to be cached on-chip. On-chip sequence number caching can reduce the latency, but at the cost of a large area overhead. In this paper, we present a novel technique that hides the latency overhead of decrypting counter-mode-encrypted memory by predicting the sequence number and precomputing the encryption pad, which we call a one-time pad (OTP). In contrast to prior sequence number caching techniques, our mechanism addresses the latency issue by using idle decryption engine cycles to speculatively predict and precompute OTPs before the corresponding sequence number is loaded. This technique incurs very little area overhead. In addition, we present a novel adaptive OTP prediction technique that further improves our regular OTP prediction and precomputation mechanism. This adaptive scheme can predict encryption pads not only for static and infrequently updated cache lines but also for frequently updated ones. Experimental results using the SPEC2000 benchmarks show an 82% prediction rate. Moreover, we explore several optimization techniques for improving prediction accuracy. Two specific techniques, two-level prediction and context-based prediction, are presented and evaluated. Two-level prediction improves the prediction rate from 82% to 96%. With context-based prediction, the prediction rate approaches 99%. Context-based OTP prediction outperforms a very large 512KB sequence number cache for many memory-bound SPEC programs. IPC results show an overall 15% to 40% performance improvement using our prediction and precomputation, and another 7% improvement when context-based prediction is used.
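
The idea in the abstract, a pad derived from a key and a sequence number, XORed with the data, and precomputed for guessed sequence numbers, can be illustrated with a minimal Python sketch. This is not the authors' hardware design. HMAC-SHA256 stands in for the block cipher (the abstract names AES as one option), and the function names, the 32-byte line size, the use of the line address as a pad input, and the simple "base plus window" guessing rule are all hypothetical choices made for illustration.

```python
import hashlib
import hmac

LINE_BYTES = 32  # hypothetical cache-line size; equals one SHA-256 digest


def make_pad(key: bytes, address: int, seq: int) -> bytes:
    """Derive an encryption pad from (key, address, sequence number).

    Assumption: HMAC-SHA256 is a stand-in for the block cipher so the
    sketch runs with the standard library only.
    """
    msg = address.to_bytes(8, "big") + seq.to_bytes(8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:LINE_BYTES]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_line(key: bytes, address: int, seq: int, plaintext: bytes) -> bytes:
    """Counter-mode style encryption: ciphertext = plaintext XOR pad."""
    return xor_bytes(plaintext, make_pad(key, address, seq))


def precompute_pads(key: bytes, address: int, base_seq: int, window: int) -> dict:
    """Speculatively compute pads for guessed sequence numbers.

    Hypothetical guessing rule: try base_seq .. base_seq + window - 1.
    In hardware this work would overlap with the memory fetch.
    """
    return {s: make_pad(key, address, s) for s in range(base_seq, base_seq + window)}


def decrypt_line(key, address, actual_seq, ciphertext, predicted):
    """Decrypt once the real sequence number arrives.

    On a prediction hit the pad is already available; on a miss it is
    computed on demand, which is the latency the prediction tries to hide.
    """
    hit = actual_seq in predicted
    pad = predicted[actual_seq] if hit else make_pad(key, address, actual_seq)
    return xor_bytes(ciphertext, pad), hit


# Usage: encrypt a line with sequence number 5, then decrypt it using pads
# precomputed for the guessed range 4..7.
key = b"k" * 16
line = bytes(range(LINE_BYTES))
ciphertext = encrypt_line(key, 0x1000, 5, line)
pads = precompute_pads(key, 0x1000, base_seq=4, window=4)
recovered, hit = decrypt_line(key, 0x1000, 5, ciphertext, pads)
```

In this sketch a hit means the pad was ready before the sequence number was known, so only the final XOR remains on the critical path. The prediction rates quoted in the abstract measure how often that happens under the paper's own predictors, which this example does not reproduce.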