Optimized GPU implementation and performance analysis of HC series of stream ciphers

Authors:
Ayesha Khalid;Deblin Bagchi;Goutam Paul;Anupam Chattopadhyay
Affiliations:
Institute for Communication Technologies and Embedded Systems, RWTH Aachen University, Aachen, Germany;Department of Computer Science and Engineering, Jadavpur University, Kolkata, India;Department of Computer Science and Engineering, Jadavpur University, Kolkata, India;Institute for Communication Technologies and Embedded Systems, RWTH Aachen University, Aachen, Germany
Venue:
ICISC'12 Proceedings of the 15th international conference on Information Security and Cryptology
Year:
2012

Abstract

The ease of programming offered by the CUDA programming model has attracted many programmers to use the platform for accelerating non-graphics applications. Cryptography is no exception and has seen its share of exploration efforts, especially for block ciphers. In this contribution we present a detailed walk-through of an effective mapping of the HC-128 and HC-256 stream ciphers onto GPUs. Parallelization of the HC series of stream ciphers remains challenging because of inherent inter-S-Box dependencies, intra-S-Box dependencies, and a high number of memory accesses per generated keystream word. We present, for the first time, various optimization strategies that speed up HC-128 and HC-256 in tune with the CUDA device architecture. The peak performance achieved with a single data-stream is 0.95 Gbps for HC-128 and 0.41 Gbps for HC-256. Although these throughput figures do not beat the CPU performance (10.9 Gbps for HC-128 and 7.5 Gbps for HC-256), our implementation with multiple parallel data-streams is benchmarked at approximately 31 Gbps for HC-128 and 14 Gbps for HC-256 (with 32768 parallel data-streams). To the best of our knowledge, this is the first reported effort to map the HC series of stream ciphers onto GPUs.
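
The throughput figures quoted in the abstract can be put side by side with a few lines of arithmetic. The following is a minimal sketch, not part of the paper: the names `CPU`, `GPU_SINGLE`, `GPU_MULTI`, `STREAMS` and `compare` are hypothetical. It assumes that 1 Gbps means 10^9 bits per second and that the aggregate multi-stream throughput is shared evenly among the 32768 data-streams.

```python
# Throughput figures as quoted in the abstract, in Gbps.
CPU = {"HC-128": 10.9, "HC-256": 7.5}
GPU_SINGLE = {"HC-128": 0.95, "HC-256": 0.41}
GPU_MULTI = {"HC-128": 31.0, "HC-256": 14.0}  # stated as approximate
STREAMS = 32768  # number of parallel data-streams in the multi-stream benchmark


def compare(cipher):
    """Return ratios derived from the quoted figures for one cipher."""
    cpu = CPU[cipher]
    return {
        # GPU single-stream throughput as a fraction of CPU throughput.
        "single_vs_cpu": GPU_SINGLE[cipher] / cpu,
        # GPU multi-stream aggregate throughput relative to CPU throughput.
        "multi_vs_cpu": GPU_MULTI[cipher] / cpu,
        # Assumption: the aggregate is split evenly across all streams.
        # 1 Gbps = 1000 Mbps under the decimal convention assumed here.
        "per_stream_mbps": GPU_MULTI[cipher] * 1000 / STREAMS,
    }


if __name__ == "__main__":
    for cipher in CPU:
        r = compare(cipher)
        print(
            f"{cipher}: single-stream GPU/CPU = {r['single_vs_cpu']:.3f}, "
            f"multi-stream GPU/CPU = {r['multi_vs_cpu']:.2f}, "
            f"per-stream share = {r['per_stream_mbps']:.3f} Mbps"
        )
```

Under these assumptions, a single GPU data-stream reaches roughly 9% (HC-128) and 5% (HC-256) of the quoted CPU throughput, while the multi-stream configuration is roughly 2.8 times (HC-128) and 1.9 times (HC-256) faster than the CPU in aggregate.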