The MMX and SSE extensions of current Intel Pentium processors offer 4-way or 8-way SIMD parallelism that can accelerate many vector and matrix applications. This paper evaluates the performance of MMX and SSE for the implementation of neural networks. Speedups ranging from 1.3 to 9.8 are shown for single neural operations, and a total speedup of up to 4.1 is achieved for the simulation of a complete neural network. A detailed performance counter analysis is provided.
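To make the 4-way case concrete, the following is a minimal sketch of a single neural operation, the weighted input sum of one neuron, written with SSE intrinsics, which process four single-precision floats per 128-bit register. It is an illustration only and not the kernel evaluated in the paper: the function name `neuron_dot`, the use of unaligned loads, and the scalar fallback for builds without SSE are all assumptions made for this example.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_mul_ps, ... */
#endif

/* Hypothetical helper: weighted sum of one neuron, sum_i w[i] * x[i]. */
static float neuron_dot(const float *w, const float *x, size_t n)
{
    float sum = 0.0f;
    size_t i = 0;

#if defined(__SSE__)
    /* Four partial sums, one per 32-bit lane of the 128-bit register. */
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        __m128 wv = _mm_loadu_ps(w + i);  /* unaligned load of 4 weights */
        __m128 xv = _mm_loadu_ps(x + i);  /* unaligned load of 4 inputs  */
        acc = _mm_add_ps(acc, _mm_mul_ps(wv, xv));
    }
    /* Horizontal reduction: add the four lanes together. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    /* Scalar tail for the remaining elements (or all of them without SSE). */
    for (; i < n; ++i)
        sum += w[i] * x[i];

    return sum;
}
```

Calling `neuron_dot(w, x, n)` once per neuron gives the pre-activation value of a layer. Because the SIMD path accumulates in a different order than a plain scalar loop, results for general floating-point inputs may differ in the last bits.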