Collaborative Filtering (CF) is an important technique for recommendation systems, which model and analyze customer preferences in order to give reasonable recommendations. Recently, many applications based on the Restricted Boltzmann Machine (RBM) have been developed for a large variety of learning problems. The RBM-based model for Collaborative Filtering (RBM-CF) can handle large-scale data sets and achieves good recommendation performance. However, the computation of RBM becomes problematic when a large number of hidden features is used to improve recommendation accuracy. Although RBM has great potential for parallelism, developing a parallel implementation of RBM-CF on GPU remains a challenge, since the data sets for CF are typically large and sparse. In this paper, we propose a parallel implementation of RBM-CF on GPU using CUDA. We first show how to transform the computation of RBM-CF into matrix-based operations on GPU, and then design three CUDA kernels for sparse matrix-matrix multiplication to further improve the computational efficiency of RBM-CF when modeling large-scale, sparse data sets. Experimental results show that our parallel implementation achieves significant speedups on GPU.
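To illustrate the matrix-based formulation the abstract refers to, the sketch below computes the RBM positive phase (hidden-unit activation probabilities) as a single sparse-dense matrix product, which is exactly the kind of operation the paper's CUDA kernels accelerate. This is a minimal CPU sketch using SciPy, not the authors' implementation; the binary rating matrix, the weight initialization, and all variable names (`V`, `W`, `b_h`) are illustrative assumptions, and the softmax visible units of the full RBM-CF model are collapsed into a single binary unit per item for brevity.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
n_users, n_items, n_hidden = 4, 6, 3

# Hypothetical sparse "rated" matrix (users x items). Real RBM-CF uses
# K softmax visible units per observed rating; here each observed entry
# is simply 1 to keep the sketch short.
V = sparse.random(n_users, n_items, density=0.3, random_state=0, format="csr")
V.data[:] = 1.0

W = rng.standard_normal((n_items, n_hidden)) * 0.01  # item-to-hidden weights
b_h = np.zeros(n_hidden)                             # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Positive phase: p(h_j = 1 | V) for every user, computed as one
# sparse-dense matrix multiplication followed by an elementwise sigmoid.
p_h = sigmoid(V @ W + b_h)
```

Because `V` is stored in CSR format, the product touches only the observed ratings, which is why a sparse kernel pays off on the large, sparse data sets typical of CF.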