A scalable Montgomery modular multiplier is composed of a queue of processing elements, and its total computation time is proportional to the latency between adjacent elements. With the feedforward architecture proposed by Huang et al., this latency can be reduced from 2 clock cycles to 1 clock cycle. This paper presents both radix-2 and radix-4 CSA-based designs of the new architecture; using Booth coding and auxiliary coding, the radix-4 design is superior to the radix-2 design in terms of Time×Area.
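For orientation, the arithmetic that such a multiplier computes can be sketched in software. The following is a minimal sketch of the textbook bit-serial radix-2 Montgomery multiplication, not the scalable word-serial feedforward architecture described above. The function name `montgomery_multiply_radix2` and its parameters are hypothetical, chosen only for illustration, and the modulus is assumed to be odd with both operands smaller than it.

```python
def montgomery_multiply_radix2(x, y, m, n):
    """Return x * y * 2**(-n) mod m (textbook radix-2 Montgomery product).

    Assumptions (illustrative, not taken from the paper):
      - m is odd and m < 2**n
      - 0 <= x < m and 0 <= y < m
    """
    if m % 2 == 0:
        raise ValueError("modulus must be odd")
    if not (0 <= x < m and 0 <= y < m and m < (1 << n)):
        raise ValueError("operands must satisfy 0 <= x, y < m < 2**n")

    s = 0
    for i in range(n):
        x_i = (x >> i) & 1   # scan the multiplier one bit per iteration
        s += x_i * y         # add the multiplicand if the bit is set
        if s & 1:            # make the partial sum even by adding m
            s += m
        s >>= 1              # exact division by 2
    # The partial sum stays below 2*m, so one subtraction is enough.
    if s >= m:
        s -= m
    return s


if __name__ == "__main__":
    m, n = 239, 8
    x, y = 100, 200
    r = montgomery_multiply_radix2(x, y, m, n)
    # Multiplying the result by 2**n recovers x*y modulo m.
    print((r * (1 << n)) % m == (x * y) % m)
```

A radix-4 variant would consume two multiplier bits per iteration and so halve the number of iterations, at the cost of a more involved digit selection. This sketch does not attempt to reproduce the Booth and auxiliary coding mentioned above.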