We consider the problem of estimating rates of rare events for high-dimensional, multivariate categorical data in which several dimensions are hierarchical. Such problems are routine in data mining applications, including computational advertising, the main focus of this paper. We propose LMMH, a novel log-linear modeling method that scales to massive data applications with billions of training records and several million potential predictors in a map-reduce framework. Our method exploits correlations in aggregates observed at multiple resolutions when working with multiple hierarchies; stable estimates at coarser resolutions provide informative priors that improve estimates at finer resolutions. In addition to prediction accuracy and scalability, our method has a built-in variable screening procedure based on a "spike and slab" prior that provides parsimony by removing non-informative predictors without hurting predictive accuracy. We perform large-scale experiments on data from real computational advertising applications and illustrate our approach on datasets with several billion records and hundreds of millions of predictors. Extensive comparisons with benchmark methods show significant improvements in prediction accuracy.
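The idea that stable coarse-resolution estimates can act as priors for sparse fine-resolution cells can be illustrated with a small sketch. This is not the LMMH model itself; it is a simple Beta-prior shrinkage estimator swapped in for illustration, and the two-level hierarchy (advertiser, then campaign), the counts, and the `prior_strength` parameter are all hypothetical.

```python
def smoothed_rate(successes, trials, prior_rate, prior_strength):
    """Posterior-mean rate under a Beta prior centered on prior_rate.

    prior_strength acts like a pseudo-count of trials contributed by the prior
    (an assumed tuning parameter, not something specified in the abstract).
    """
    return (successes + prior_strength * prior_rate) / (trials + prior_strength)


def fine_estimates(coarse, fine, prior_strength=100.0):
    """Shrink each fine-resolution cell toward its parent's observed rate."""
    estimates = {}
    for (parent, child), (successes, trials) in fine.items():
        parent_successes, parent_trials = coarse[parent]
        parent_rate = parent_successes / parent_trials
        estimates[(parent, child)] = smoothed_rate(
            successes, trials, parent_rate, prior_strength
        )
    return estimates


# Hypothetical counts: (events, trials) at the coarse and fine resolutions.
coarse = {"adv_A": (50, 10_000)}
fine = {
    ("adv_A", "camp_1"): (0, 20),    # very sparse cell, raw rate is 0
    ("adv_A", "camp_2"): (3, 400),   # more data, raw rate is 0.0075
}

estimates = fine_estimates(coarse, fine)
for cell, rate in estimates.items():
    print(cell, round(rate, 6))
```

In this sketch the sparse cell `camp_1` is pulled most of the way toward the parent rate of 0.005 instead of being estimated as exactly zero, while `camp_2`, which has more trials, moves only slightly from its raw rate.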