Reducing memory latency via non-blocking and prefetching caches
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Value locality and load value prediction
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
ACM Computing Surveys (CSUR)
Architectural support for copy and tamper resistant software
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Frequent value locality and its applications
ACM Transactions on Embedded Computing Systems (TECS)
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Effective Hardware-Based Data Prefetching for High-Performance Processors
IEEE Transactions on Computers
Hiding program slices for software security
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
AEGIS: architecture for tamper-evident and tamper-resistant processing
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
A Concrete Security Treatment of Symmetric Encryption
FOCS '97 Proceedings of the 38th Annual Symposium on Foundations of Computer Science
Guided region prefetching: a cooperative hardware/software approach
Proceedings of the 30th annual international symposium on Computer architecture
A secure and reliable bootstrap architecture
SP '97 Proceedings of the 1997 IEEE Symposium on Security and Privacy
Implementing an untrusted operating system on trusted hardware
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Efficient Memory Integrity Verification and Encryption for Secure Processors
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Fast Secure Processor for Inhibiting Software Piracy and Tampering
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
PointguardTM: protecting pointers from buffer overflow vulnerabilities
SSYM'03 Proceedings of the 12th conference on USENIX Security Symposium - Volume 12
Improving Cost, Performance, and Security of Memory Encryption and Authentication
Proceedings of the 33rd annual international symposium on Computer Architecture
A low-cost memory remapping scheme for address bus protection
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Efficient data protection for distributed shared memory multiprocessors
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
SecCMP: a secure chip-multiprocessor architecture
Proceedings of the 1st workshop on Architectural and system support for improving software dependability
Authentication Control Point and Its Implications For Secure Processor Design
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Accelerating memory decryption and authentication with frequent value prediction
Proceedings of the 4th international conference on Computing frontiers
Addressing instruction fetch bottlenecks by using an instruction register file
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Making secure processors OS- and performance-friendly
ACM Transactions on Architecture and Code Optimization (TACO)
Memory-Centric Security Architecture
Transactions on High-Performance Embedded Architectures and Compilers I
SHARK: Architectural support for autonomic protection against stealth by rootkit exploits
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Compiler-Assisted Memory Encryption for Embedded Processors
Transactions on High-Performance Embedded Architectures and Compilers II
On protecting integrity and confidentiality of cryptographic file system for outsourced storage
Proceedings of the 2009 ACM workshop on Cloud computing security
A low-cost memory remapping scheme for address bus protection
Journal of Parallel and Distributed Computing
Compiler-assisted memory encryption for embedded processors
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Secure cryptographic precomputation with insecure memory
ISPEC'08 Proceedings of the 4th international conference on Information security practice and experience
SHIELDSTRAP: making secure processors truly secure
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
A session key caching and prefetching scheme for secure communication in cluster systems
Journal of Parallel and Distributed Computing
An analysis of secure processor architectures
Transactions on computational science VII
The modulo-10 partition counter
ICC'06 Proceedings of the 10th WSEAS international conference on Circuits
Green secure processors: towards power-efficient secure processor design
Transactions on computational science X
SecureME: a hardware-software approach to full system security
Proceedings of the international conference on Supercomputing
DynaPoMP: dynamic policy-driven memory protection for SPM-based embedded systems
WESS '11 Proceedings of the Workshop on Embedded Systems Security
Memory-centric security architecture
HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
Memory encryption for smart cards
CARDIS'11 Proceedings of the 10th IFIP WG 8.8/11.2 international conference on Smart Card Research and Advanced Applications
Bus and memory protection through chain-generated and tree-verified IV for multiprocessors systems
Future Generation Computer Systems
Proceedings of the ACM International Conference on Computing Frontiers
An efficient run-time encryption scheme for non-volatile main memory
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Hi-index | 0.00 |
Encrypting data in unprotected memory has gained much interest lately for digital rights protection and security reasons. Counter Mode is a well-known encryption scheme. It is a symmetric-key encryption scheme based on any block cipher, e.g. AES. The schemeýs encryption algorithm uses a block cipher, a secret key and a counter (or a sequence number) to generate an encryption pad which is XORed with the data stored in memory. Like other memory encryption schemes, this method suffers from the inherent latency of decrypting encrypted data when loading them into the on-chip cache. One solution that parallelizes data fetching and encryption pad generation requires the sequence numbers of evicted cache lines to be cached on-chip. On-chip sequence number caching can be successful in reducing the latency at the cost of a large area overhead. In this paper, we present a novel technique to hide the latency overhead of decrypting counter mode encrypted memory by predicting the sequence number and pre-computing the encryption pad that we call one-time-pad or OTP. In contrast to the prior techniques of sequence number caching, our mechanism solves the latency issue by using idle decryption engine cycles to speculatively predict and pre-compute OTPs before the corresponding sequence number is loaded. This technique incurs very little area overhead. In addition, a novel adaptive OTP prediction technique is also presented to further improve our regular OTP prediction and precomputation mechanism. This adaptive scheme is not only able to predict encryption pads associated with static and infrequently updated cache lines but also those frequently updated ones as well. Experimental results using SPEC2000 benchmark show an 82% prediction rate. Moreover, we also explore several optimization techniques for improving the prediction accuracy. Two specific techniques, Two-level prediction and Context-based prediction are presented and evaluated. For the two-level prediction, the prediction rate was improved from 82% to 96%. With the context-based prediction, the prediction rate approaches 99%. Context-based OTP prediction outperforms a very large 512KB sequence number cache for many memory-bound SPEC programs. IPC results show an overall 15% to 40% performance improvement using our prediction and precomputation, and another 7% improvement when context-based prediction techniques is used.