Distributed Big Advertiser Data Mining

  • Authors:
  • Ashish Bindra;Srinivasulu Pokuri;Krishna Uppala;Ankur Teredesai

  • Venue:
  • ICDMW '12 Proceedings of the 2012 IEEE 12th International Conference on Data Mining Workshops
  • Year:
  • 2012

Abstract

Advertisers and big data mining experts alike are today dealing with complex datasets of increasing variety (first- and third-party data), volume (events, impressions, clicks), and velocity (real-time bidding). Creating predictive models that customize advertiser requirements and campaign analytics, so that targeted ads are shown to the users most likely to convert, has become increasingly challenging. Advertisers often group customers into a segment defined by a given set of demographic or behavioral attributes. Such segments are often very sparse. "Look-Alike Modeling" enables advertisers to enhance the target segment: predictive models expand segment membership by assigning a probability score to users who did not explicitly belong to the segment under its original definition. In this paper, accompanied by a demo of a distributed platform, we describe a Look-Alike Modeling framework that expands segment membership using a novel high-dimensional distributed algorithm based on frequent pattern mining. We describe how the distributed algorithm is more efficient than traditional classification techniques that (a) require multiple passes over the dataset and (b) require both positive and negative class labels for training. Our solution is capable of concurrently and continuously processing thousands of segments and includes an efficient grouping operator and a distributed scoring algorithm for predicting multiple segment membership for a given (very large) set of users. The approach leverages the power of in-database analytics rather than standard data mining libraries and is currently deployed on a real-world, highly scalable distributed columnar database that powers several hundred campaigns and processes look-alike models for large online display advertisers. The results from the study demonstrate that the proposed algorithm outperforms other comparable techniques for predicting and expanding segments.
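To make the idea of pattern-based look-alike scoring concrete, the following is a minimal single-machine sketch in Python. It is not the paper's distributed algorithm, which the abstract does not specify in detail. The function names (`mine_frequent_patterns`, `score_user`), the attribute strings, the support threshold, and the scoring rule (support-weighted fraction of matched patterns) are all assumptions chosen for illustration. The sketch does reflect one property the abstract highlights: patterns are mined from segment members only, so no negative class labels are needed.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_patterns(segment_users, min_support=0.5, max_size=2):
    """Return {pattern: support} for attribute combinations that occur in at
    least `min_support` of the segment members (positive examples only)."""
    counts = Counter()
    for attrs in segment_users:
        # Sort so that each combination has one canonical tuple form.
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(attrs), size):
                counts[combo] += 1
    n = len(segment_users)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def score_user(attrs, patterns):
    """Hypothetical scoring rule: the support-weighted fraction of the
    segment's frequent patterns that the user's attributes contain."""
    total = sum(patterns.values())
    if total == 0:
        return 0.0
    matched = sum(s for p, s in patterns.items() if set(p) <= attrs)
    return matched / total


# Illustrative segment members, each described by a set of attributes.
segment = [
    {"age:25-34", "interest:sports", "device:mobile"},
    {"age:25-34", "interest:sports", "device:desktop"},
    {"age:25-34", "interest:travel", "device:mobile"},
    {"age:35-44", "interest:sports", "device:mobile"},
]
patterns = mine_frequent_patterns(segment, min_support=0.5, max_size=2)

# Users outside the segment, ranked by how closely they resemble it.
candidates = {
    "u1": {"age:25-34", "interest:sports", "device:mobile"},
    "u2": {"age:55-64", "interest:gardening"},
    "u3": {"age:25-34", "device:mobile"},
}
scores = {uid: score_user(attrs, patterns) for uid, attrs in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here the patterns are mined in a single pass over the segment members, and candidates are then scored independently of one another, which is the kind of step that could be spread across workers or expressed inside a database. How the deployed system actually performs grouping and scoring is not described in the abstract.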