Decoding is one of the major performance bottlenecks in network coding applications. To address the delay that decoding introduces, this paper proposes high-performance decoding logic on a field-programmable gate array (FPGA). A fully parallelized Galois field arithmetic logic unit (GF ALU) is implemented. Hardware complexity is reduced by using log and anti-log tables, and fast arithmetic is achieved by the parallelized GF ALU architecture, which allows the calculations for one row of a matrix to be performed concurrently. Decoders for four different coefficient-matrix sizes have been implemented, with the degree of parallelism preserved for each size. Performance is evaluated against the same decoding operation on both an ARM processor emulator and a real ARM processor. On a modern Xilinx Virtex-5 device operating at 50 MHz, decoding takes 3.5 ms for a 16 × 16 matrix and 190.5 ms for a 128 × 128 matrix, corresponding to speedups of 12.7 and 21.7, respectively.
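
The log and anti-log table technique mentioned above can be illustrated in software. The following is a minimal sketch, not the paper's hardware design: it assumes the field GF(2^8) with the primitive polynomial 0x11D (the abstract does not state the field size or polynomial), and the names build_tables, gf_mul, gf_inv, and eliminate_row are hypothetical. It shows how a table lookup replaces polynomial multiplication, and how one Gaussian-elimination row update breaks into independent per-element operations, which is what makes a row a natural unit for parallel hardware.

```python
# Assumption: GF(2^8) with primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D).
# The source does not specify the field; these values are illustrative only.
PRIM_POLY = 0x11D
FIELD_ORDER = 255  # number of nonzero elements in GF(2^8)


def build_tables():
    """Build anti-log (exp) and log tables for the generator element 2."""
    exp = [0] * (2 * FIELD_ORDER)  # doubled so log sums need no modulo
    log = [0] * 256                # log[0] is unused: zero has no logarithm
    x = 1
    for i in range(FIELD_ORDER):
        exp[i] = x
        log[x] = i
        x <<= 1                    # multiply by the generator
        if x & 0x100:              # reduce when the degree reaches 8
            x ^= PRIM_POLY
    for i in range(FIELD_ORDER, 2 * FIELD_ORDER):
        exp[i] = exp[i - FIELD_ORDER]
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    """Multiply two field elements with two log lookups and one anti-log lookup."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse via the log table."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse in GF(2^8)")
    return EXP[FIELD_ORDER - LOG[a]]


def eliminate_row(target, pivot, factor):
    """Return target + factor * pivot over GF(2^8); addition is XOR.

    Each element is computed independently of the others, so in hardware
    all elements of the row could be updated at the same time.
    """
    return [t ^ gf_mul(factor, p) for t, p in zip(target, pivot)]
```

For example, eliminate_row([5, 7], [1, 3], 5) clears the leading entry of the target row, since 5 XOR (5 · 1) is 0 in this field. In this sketch the list comprehension runs sequentially; the point is only that no element depends on another.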