Efficient Geometric Algorithms on the EREW PRAM
IEEE Transactions on Parallel and Distributed Systems
Parallel programming with MPI
Proceedings of the 1998 conference on Advances in neural information processing systems II
PSBLAS: a library for parallel linear algebra computation on sparse matrices
ACM Transactions on Mathematical Software (TOMS)
Co-clustering documents and words using bipartite spectral graph partitioning
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Query clustering using user logs
ACM Transactions on Information Systems (TOIS)
Document clustering based on non-negative matrix factorization
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Non-negative Matrix Factorization with Sparseness Constraints
The Journal of Machine Learning Research
Fast maximum margin matrix factorization for collaborative prediction
ICML '05 Proceedings of the 22nd international conference on Machine learning
Improving web search ranking by incorporating user behavior information
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Orthogonal nonnegative matrix t-factorizations for clustering
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Google news personalization: scalable online collaborative filtering
Proceedings of the 16th international conference on World Wide Web
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Predictive discrete latent factor models for large scale dyadic data
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
ICDMW '07 Proceedings of the Seventh IEEE International Conference on Data Mining Workshops
Fast Projection-Based Methods for the Least Squares Nonnegative Matrix Approximation Problem
Statistical Analysis and Data Mining
Fully distributed EM for very large datasets
Proceedings of the 25th international conference on Machine learning
BrowseRank: letting web users vote for page importance
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Learning tag relevance by neighbor voting for social image retrieval
MIR '08 Proceedings of the 1st ACM international conference on Multimedia information retrieval
Collaborative filtering for orkut communities: discovery of user latent behavior
Proceedings of the 18th international conference on World wide web
Large-scale behavioral targeting
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Fast nonparametric matrix factorization for large-scale collaborative filtering
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Document clustering using nonnegative matrix factorization
Information Processing and Management: an International Journal
Query recommendation using query logs in search engines
EDBT'04 Proceedings of the 2004 international conference on Current Trends in Database Technology
RecLab: a system for eCommerce recommender research with real data, context and feedback
Proceedings of the 2011 Workshop on Context-awareness in Retrieval and Recommendation
Regularized latent semantic indexing
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Large-scale matrix factorization with distributed stochastic gradient descent
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Factorization-based lossless compression of inverted indices
Proceedings of the 20th ACM international conference on Information and knowledge management
MadLINQ: large-scale distributed matrix computation for the cloud
Proceedings of the 7th ACM european conference on Computer Systems
Memory-restricted latent semantic analysis to accumulate term-document co-occurrence events
Pattern Recognition Letters
GigaTensor: scaling tensor analysis up by 100 times - algorithms and discoveries
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Large-scale distributed non-negative sparse coding and sparse dictionary learning
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
MapReduce for parallel reinforcement learning
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
MapReduce algorithms for big data analysis
Proceedings of the VLDB Endowment
Regularized Latent Semantic Indexing: A New Approach to Large-Scale Topic Modeling
ACM Transactions on Information Systems (TOIS)
Online projective nonnegative matrix factorization for large datasets
ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part III
Sparkler: supporting large-scale matrix factorization
Proceedings of the 16th International Conference on Extending Database Technology
A general collaborative filtering framework based on matrix bordered block diagonal forms
Proceedings of the 24th ACM Conference on Hypertext and Social Media
Improve collaborative filtering through bordered block diagonal form matrices
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Making recommendations from multiple domains
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Distributed large-scale natural graph factorization
Proceedings of the 22nd international conference on World Wide Web
Localized matrix factorization for recommendation based on matrix block diagonal forms
Proceedings of the 22nd international conference on World Wide Web
Non-negative multiple matrix factorization
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Cost-Aware Collaborative Filtering for Travel Tour Recommendations
ACM Transactions on Information Systems (TOIS)
Partial-update dimensionality reduction for accumulating co-occurrence events
Pattern Recognition Letters
Comment-based multi-view clustering of web 2.0 items
Proceedings of the 23rd international conference on World wide web
The Web abounds with dyadic data, and the volume grows every second. Previous work has repeatedly shown the usefulness of extracting the interaction structure inside dyadic data [21, 9, 8]. A commonly used tool for extracting this underlying structure is matrix factorization, whose popularity was further boosted by the Netflix challenge [26]. When we tried to replicate that success on real-world Web dyadic data, however, we were seriously challenged by the scalability of the available tools. In this paper we therefore report our efforts to scale up the nonnegative matrix factorization (NMF) technique. We show that by carefully partitioning the data and arranging the computations to maximize data locality and parallelism, factorizing a tens-of-millions by hundreds-of-millions matrix with billions of nonzero cells can be accomplished within tens of hours. This result effectively assures practitioners that NMF scales to Web-scale dyadic data.
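To make the partitioning idea concrete, here is a minimal single-process sketch. It rests on assumptions the abstract does not state: it uses the standard multiplicative update rules for NMF under squared error, and it splits the sparse matrix A into row blocks. The names `nmf_row_blocks` and `frobenius_error` are hypothetical, and the loops over blocks only stand in for work that separate machines could do in parallel. The point it illustrates is locality: updating a block of W needs only that block's rows of A plus the small k x k matrix H H^T, while updating H needs only per-block partial sums that are combined by addition.

```python
import random

def matmul(X, Y):
    """Dense product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def nmf_row_blocks(A_blocks, n_cols, k, iters=200, seed=0, eps=1e-9):
    """Factor A ~ W H with multiplicative updates (an assumed update rule).

    A_blocks is a list of row blocks; each block is a list of sparse rows,
    and each sparse row is a dict {column_index: nonnegative_value}.
    """
    rng = random.Random(seed)
    W_blocks = [[[rng.random() + 0.1 for _ in range(k)] for _ in block]
                for block in A_blocks]
    H = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(k)]
    for _ in range(iters):
        # Update W: every block is independent once H H^T (k x k) is shared.
        HHt = matmul(H, transpose(H))
        for A_b, W_b in zip(A_blocks, W_blocks):
            for a_row, w_row in zip(A_b, W_b):
                # Numerator touches only the nonzero cells of this row of A.
                num = [sum(v * H[r][j] for j, v in a_row.items()) for r in range(k)]
                den = [sum(w_row[s] * HHt[s][r] for s in range(k)) for r in range(k)]
                for r in range(k):
                    w_row[r] *= num[r] / (den[r] + eps)
        # Update H: each block contributes partial sums of W^T A and W^T W.
        WtA = [[0.0] * n_cols for _ in range(k)]
        WtW = [[0.0] * k for _ in range(k)]
        for A_b, W_b in zip(A_blocks, W_blocks):
            for a_row, w_row in zip(A_b, W_b):
                for j, v in a_row.items():
                    for r in range(k):
                        WtA[r][j] += w_row[r] * v
                for r in range(k):
                    for s in range(k):
                        WtW[r][s] += w_row[r] * w_row[s]
        WtWH = matmul(WtW, H)
        for r in range(k):
            for j in range(n_cols):
                H[r][j] *= WtA[r][j] / (WtWH[r][j] + eps)
    W = [row for block in W_blocks for row in block]
    return W, H

def frobenius_error(A_blocks, W, H, n_cols):
    """Frobenius norm of A - W H, computed densely (fine for toy sizes only)."""
    rows = [row for block in A_blocks for row in block]
    WH = matmul(W, H)
    return sum((a.get(j, 0.0) - WH[i][j]) ** 2
               for i, a in enumerate(rows) for j in range(n_cols)) ** 0.5

# Toy 4 x 3 matrix split into two row blocks of two rows each.
example_blocks = [
    [{0: 5.0, 1: 3.0}, {0: 4.0, 2: 1.0}],
    [{1: 1.0, 2: 5.0}, {0: 1.0, 2: 4.0}],
]
W, H = nmf_row_blocks(example_blocks, n_cols=3, k=2)
```

In this sketch the only data shared across blocks are k x k and k x n_cols matrices, which is why a row-wise split keeps the large sparse matrix local to each worker. A real deployment would also have to partition H when the column count is in the hundreds of millions, which this toy version does not attempt.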