Stochastic gradient descent with GPGPU

  • Authors:
  • David Zastrau; Stefan Edelkamp

  • Affiliations:
  • Faculty 3 -- Mathematics and Computer Science, University of Bremen, Bremen, Germany (both authors)

  • Venue:
  • KI'12: Proceedings of the 35th Annual German Conference on Advances in Artificial Intelligence
  • Year:
  • 2012


Abstract

We show how to optimize a Support Vector Machine and a predictor for Collaborative Filtering with Stochastic Gradient Descent on the GPU, achieving speedups of 1.66 to 6 times over a CPU-based implementation. The reference implementations are the Support Vector Machine by Bottou and the BRISMF predictor from the Netflix Prize winning team. Our main idea is to compute a hash function of the input data and use it to execute threads in parallel that write to different elements of the parameter vector. We also compare the iterative optimization with batch gradient descent and an alternating least squares optimization. The predictor is tested on more than one hundred million data points, which demonstrates the growing memory-management capabilities of modern GPUs. We use matrix compression as well as float compression to alleviate the memory bottleneck.
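The scheduling idea from the abstract, hashing the input so that concurrently running threads never write the same parameter element, can be sketched as follows. This is an illustrative reconstruction, not the authors' GPU kernel: the function names, the one-sparse-feature-per-sample model, and the squared-error loss are all assumptions made for the example. Samples whose updates touch the same weight index hash into the same bucket, so the buckets themselves could safely run in parallel.

```python
import numpy as np

def sgd_hash_buckets(samples, w, lr=0.1, n_buckets=4):
    """Illustrative sketch (not the paper's actual kernel): group samples
    by a hash of the weight index each one updates, so two samples that
    write the same element always land in the same bucket.  Buckets can
    then execute as parallel threads without write conflicts; here they
    run sequentially for clarity."""
    buckets = [[] for _ in range(n_buckets)]
    for idx, x, y in samples:  # sample = (sparse feature index, value, target)
        buckets[hash(idx) % n_buckets].append((idx, x, y))
    for bucket in buckets:     # on a GPU, each bucket would be one thread
        for idx, x, y in bucket:
            grad = (w[idx] * x - y) * x  # squared-error gradient for one sample
            w[idx] -= lr * grad
    return w
```

Because the bucket assignment depends only on the weight index, the no-conflict guarantee holds for any hash function; the quality of the hash mainly affects load balance across threads.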