We provide a novel algorithm to approximately factor large matrices with millions of rows, millions of columns, and billions of nonzero elements. Our approach rests on stochastic gradient descent (SGD), an iterative stochastic optimization algorithm. We first develop a novel "stratified" SGD variant (SSGD) that applies to general loss-minimization problems in which the loss function can be expressed as a weighted sum of "stratum losses." We establish sufficient conditions for convergence of SSGD using results from stochastic approximation theory and regenerative process theory. We then specialize SSGD to obtain a new matrix-factorization algorithm, called DSGD, that can be fully distributed and run on web-scale datasets using, e.g., MapReduce. DSGD can handle a wide variety of matrix factorizations. We describe the practical techniques used to optimize performance in our DSGD implementation. Experiments suggest that DSGD converges significantly faster and has better scalability properties than alternative algorithms.
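To make the stratified idea concrete, the following is a minimal single-process sketch, not the authors' implementation. It assumes a squared loss over observed entries, a constant step size, and a d x d blocking of the matrix in which each stratum consists of d blocks that share no rows or columns; the abstract does not spell out these details, so they are illustrative assumptions. The names `dsgd_sketch` and `rmse` and all parameters are hypothetical.

```python
import random

def dsgd_sketch(entries, m, n, rank=2, d=2, epochs=100, step=0.05, seed=0):
    """Approximate V ~= W H^T from observed (i, j, value) triples.

    Assumptions (not from the abstract): squared loss, constant step size,
    and strata built from d blocks with pairwise disjoint rows and columns.
    """
    rng = random.Random(seed)
    # W holds one factor vector per row, H one factor vector per column.
    W = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(m)]
    H = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n)]

    # Partition the observed entries into a d x d grid of blocks.
    blocks = {(a, b): [] for a in range(d) for b in range(d)}
    for i, j, v in entries:
        blocks[(i * d // m, j * d // n)].append((i, j, v))

    for _ in range(epochs):
        strata = list(range(d))
        rng.shuffle(strata)  # visit the d strata in random order
        for s in strata:
            # Stratum s pairs row block a with column block (a + s) % d.
            # The d blocks touch disjoint parts of W and H, so they could
            # be handed to d workers; here they simply run one by one.
            for a in range(d):
                block = blocks[(a, (a + s) % d)]
                rng.shuffle(block)
                for i, j, v in block:
                    err = v - sum(w * h for w, h in zip(W[i], H[j]))
                    for k in range(rank):
                        w, h = W[i][k], H[j][k]
                        W[i][k] = w + step * err * h  # SGD step on row factor
                        H[j][k] = h + step * err * w  # SGD step on column factor
    return W, H

def rmse(entries, W, H):
    """Root-mean-square error of the factorization on the given entries."""
    sq = [(v - sum(w * h for w, h in zip(W[i], H[j]))) ** 2 for i, j, v in entries]
    return (sum(sq) / len(sq)) ** 0.5
```

Because the blocks within a stratum update disjoint rows of `W` and disjoint rows of `H`, processing them concurrently would give the same result as the sequential loop shown here, which is what makes this style of SGD amenable to distribution.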