On-line machine learning with stochastic gradient descent is an inherently sequential computation, which makes it difficult to improve performance simply by employing parallel architectures. Langford et al. proposed a modification of standard stochastic gradient descent that opens up the possibility of parallel computation, and proved that it incurs no significant loss in accuracy. They also empirically demonstrated the speedup for a pipelined architecture with a small number of processing units. In this paper we report on applying the approach of Langford et al. on a General-Purpose Graphics Processing Unit (GPGPU) with a large number of processing units. We accelerate learning by approximately a factor of 4.5 compared to a standard single-threaded implementation, with comparable accuracy. We also evaluate GPU performance for the sequential variant of the algorithm, which has not previously been reported. Finally, we investigate how the number of threads, the number of blocks, and the amount of delay affect overall performance and accuracy.
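The key idea in Langford et al.'s modification is a delayed update: at step t, the parameters are updated with a gradient computed at an iterate that is tau steps stale, w_{t+1} = w_t - lr * grad(w_{t-tau}), which is what allows the gradient computations to proceed in parallel. The following is a minimal single-threaded sketch of that update rule; the function name, the toy squared-error loss, and all parameter choices are illustrative assumptions, not taken from the paper or its GPU implementation.

```python
import numpy as np
from collections import deque

def delayed_sgd(X, y, lr=0.05, tau=4, epochs=3):
    """Sketch of the delayed-update rule w_{t+1} = w_t - lr * grad(w_{t-tau}).

    Mimics, in serial form, the staleness that arises when tau workers
    compute gradients in a pipeline. Loss here is 0.5 * (x.w - y)^2
    (illustrative only).
    """
    n, d = X.shape
    w = np.zeros(d)
    # Keep the last tau+1 iterates; stale[0] is (roughly) w_{t - tau}.
    stale = deque([w.copy()], maxlen=tau + 1)
    for _ in range(epochs):
        for i in range(n):
            w_old = stale[0]                     # tau-steps-stale iterate
            grad = (X[i] @ w_old - y[i]) * X[i]  # gradient at the stale point
            w = w - lr * grad                    # update the current iterate
            stale.append(w.copy())
    return w

# Toy usage: recover a linear model from noisy data despite the delay.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + 0.01 * rng.normal(size=1000)
print(delayed_sgd(X, y))
```

On a GPU, each of the tau in-flight gradient computations would be handled by separate threads or blocks rather than replayed serially as above, which is why the number of threads, the number of blocks, and the delay tau jointly determine both throughput and accuracy.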