Incremental gradient descent is a general technique for solving the large class of convex optimization problems that arise in many machine learning tasks. GLADE is a parallel infrastructure for big data analytics that provides a generic task specification interface. In this paper, we present a scalable and efficient parallel solution for incremental gradient descent in GLADE. We provide empirical evidence that our solution is limited only by the physical characteristics of the hardware, makes effective use of the available resources, and achieves maximum scalability. When deployed in the cloud, our solution has the potential to dramatically reduce the cost of complex analytics over massive datasets.
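To make the underlying technique concrete, the following is a minimal sketch of incremental gradient descent on a least-squares objective, updating the model one example at a time. The data, step size, and loop structure here are illustrative assumptions for a single machine; this is not the parallel GLADE implementation the paper describes.

```python
import numpy as np

def incremental_gradient_descent(X, y, step=0.01, epochs=10):
    """Minimize (1/2n) * sum_i (x_i . w - y_i)^2, one example at a time.

    Each update uses the gradient of a single term of the sum, so the
    model is refined incrementally as examples are visited.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in np.random.permutation(n):
            residual = X[i].dot(w) - y[i]   # error on example i
            w -= step * residual * X[i]     # gradient of that single term
    return w

# Toy usage: recover a known linear model from noisy observations.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=1000)
print(incremental_gradient_descent(X, y))   # approximately [2, -1, 0.5]
```

Because each update touches only one data point, the method suits very large datasets that cannot be processed in full-gradient batches; parallelizing these per-example updates across a cluster is the challenge the paper addresses within GLADE.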