A highly efficient implementation of back propagation algorithm using matrix instruction set architecture

  • Authors:
  • Mostafa I. Soliman;Samir A. Mohamed

  • Affiliations:
  • Computer & Control Section, Electrical Engineering Department, Faculty of Engineering, South Valley University, Aswan, Arab Republic of Egypt;Computer & Control Section, Electrical Engineering Department, Faculty of Engineering, South Valley University, Aswan, Arab Republic of Egypt

  • Venue:
  • Neural, Parallel & Scientific Computations
  • Year:
  • 2007

Abstract

The Back Propagation (BP) training algorithm has received intensive research effort aimed at exploiting its parallelism to reduce training time on complex problems. A modified version of BP based on matrix-matrix multiplication was proposed for parallel processing. This paper discusses the implementation of Matrix Back Propagation (MBP) using scalar, vector, and matrix instruction set architectures (ISAs). It also shows that the performance of MBP improves when switching from scalar to vector ISA and from vector to matrix ISA. On a practical application, speech recognition, the speedup of training a neural network using loop-unrolled scalar code over scalar ISA is 1.83. On eight parallel lanes, the speedups of the vector, unrolled vector, and matrix ISAs are 10.33, 11.88, and 15.36, respectively, where the maximum theoretical speedup is 16. Our results show that the matrix ISA gives performance close to optimal because it reuses loaded data, decreases loop overhead, and overlaps memory operations with arithmetic operations.
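
The abstract does not give the MBP formulation itself, so the following is only a minimal sketch of the general idea it names: processing all training patterns together so that the forward and backward passes become matrix-matrix products. The network shape (one hidden layer, sigmoid units, squared-error loss), the toy data, and every function and variable name here are assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def matmul(a, b):
    """Matrix-matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def sigmoid(m):
    """Elementwise logistic function."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in m]

def sigmoid_delta(err, act):
    """Elementwise err * act * (1 - act): error scaled by the sigmoid derivative."""
    return [[e * a * (1.0 - a) for e, a in zip(er, ar)] for er, ar in zip(err, act)]

def sgd_update(w, grad, lr):
    return [[wv - lr * gv for wv, gv in zip(wr, gr)] for wr, gr in zip(w, grad)]

def train_step(x, t, w1, w2, lr):
    """One batched gradient step; rows of x and t are training patterns."""
    h = sigmoid(matmul(x, w1))                            # patterns x hidden
    y = sigmoid(matmul(h, w2))                            # patterns x outputs
    err = [[yv - tv for yv, tv in zip(yr, tr)] for yr, tr in zip(y, t)]
    d2 = sigmoid_delta(err, y)                            # output-layer deltas
    d1 = sigmoid_delta(matmul(d2, transpose(w2)), h)      # hidden-layer deltas
    loss = 0.5 * sum(e * e for row in err for e in row)   # loss before the update
    new_w1 = sgd_update(w1, matmul(transpose(x), d1), lr)
    new_w2 = sgd_update(w2, matmul(transpose(h), d2), lr)
    return new_w1, new_w2, loss

# Toy data (hypothetical): logical OR, with a constant 1.0 column acting as a bias input.
random.seed(0)
x = [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
t = [[0.0], [1.0], [1.0], [1.0]]
w1 = [[random.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(3)]
w2 = [[random.uniform(-0.5, 0.5)] for _ in range(4)]

losses = []
for _ in range(200):
    w1, w2, loss = train_step(x, t, w1, w2, lr=0.5)
    losses.append(loss)
```

In this form nearly all of the arithmetic sits inside `matmul`, which is the kind of kernel where loaded operands can be reused and loop overhead amortized, the factors the abstract credits for the matrix ISA result.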