Classification of proxy labeled examples for marketing segment generation

  • Authors:
  • Dean Cerrato; Rosie Jones; Avinash Gupta

  • Affiliations:
  • Akamai Technologies, Cambridge, MA, USA; Akamai Technologies, Cambridge, MA, USA; Akamai Technologies, Cambridge, MA, USA

  • Venue:
  • Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2011

Abstract

Marketers often rely on a set of descriptive segments, or qualitative subsets of the population, to specify the audiences of targeted advertising campaigns. For example, the descriptive segment "Empty Nesters" might describe a desirable target audience for extended vacation package offers. While some segments may be easily described and generated using demographic data as ground truth, others such as "Soccer Moms" or "Urban Hipsters" reflect a combination of demographic and behavioral attributes. Ideally, these attributes would be available as the basis for ground truth labeling of a classifier training set or even direct member selection from the population. Unfortunately, ground truth attributes are often scarce or unavailable, in which case a proxy labeling scheme is needed. We devise a method for labeling a population according to criteria based on a postulated set of shopping behaviors specific to a descriptive segment. We then perform supervised binary classification on this labeled dataset in order to discover additional identifying patterns of behavior typical of labeled positives in the population. Finally, the resulting classifier is used to perform selection from the population into the segment, extending reach to cookies who may not have exhibited the postulated behaviors but likely belong in the segment. We validate our approach by comparing a descriptive segment trained on ground truth to one trained on behavioral attributes only. We show that our behavior-based approach produces classifiers having performance comparable to that of a classifier trained on the ground truth data.
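The three steps described in the abstract (proxy labeling from postulated behaviors, supervised binary classification, and selection from the population) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the behavior names, the toy population, the rule that a cookie must show every postulated behavior to be labeled positive, and the use of a Bernoulli naive Bayes classifier. The abstract does not say which classifier or labeling criteria the authors used.

```python
import math

# Hypothetical behaviors postulated for an "Empty Nesters"-style segment.
POSTULATED = {"luggage", "cruise_guide"}
# Other observed behaviors the classifier may learn from (also hypothetical).
OTHER = ["garden_tools", "reading_glasses", "video_games"]

# Toy population: cookie id -> set of observed shopping behaviors.
population = {
    "c1": {"luggage", "cruise_guide", "garden_tools", "reading_glasses"},
    "c2": {"luggage", "cruise_guide", "reading_glasses"},
    "c3": {"luggage", "cruise_guide", "garden_tools"},
    "c4": {"video_games"},
    "c5": {"video_games", "luggage"},
    "c6": {"video_games", "garden_tools"},
    "c7": {"garden_tools", "reading_glasses"},  # no postulated behaviors
    "c8": {"video_games"},
}


def proxy_label(behaviors):
    """Step 1: positive if the cookie exhibited every postulated behavior."""
    return POSTULATED <= behaviors


def train_naive_bayes(data, labels):
    """Step 2: fit Bernoulli naive Bayes (Laplace-smoothed) on OTHER features."""
    n = {True: 0, False: 0}
    counts = {True: dict.fromkeys(OTHER, 0), False: dict.fromkeys(OTHER, 0)}
    for cid, behaviors in data.items():
        y = labels[cid]
        n[y] += 1
        for f in OTHER:
            counts[y][f] += f in behaviors
    prob = {y: {f: (counts[y][f] + 1) / (n[y] + 2) for f in OTHER} for y in n}
    log_prior_odds = math.log(n[True] / n[False])
    return prob, log_prior_odds


def log_odds(behaviors, model):
    """Log-odds that a cookie belongs to the segment, given OTHER features."""
    prob, score = model
    for f in OTHER:
        p_pos, p_neg = prob[True][f], prob[False][f]
        if f in behaviors:
            score += math.log(p_pos / p_neg)
        else:
            score += math.log((1 - p_pos) / (1 - p_neg))
    return score


labels = {cid: proxy_label(b) for cid, b in population.items()}
model = train_naive_bayes(population, labels)

# Step 3: select every cookie the classifier scores as more likely positive.
segment = sorted(cid for cid, b in population.items() if log_odds(b, model) > 0)
print(segment)
```

In this toy data, cookie `c7` never exhibits a postulated behavior and so receives a negative proxy label, yet its other behaviors resemble those of the labeled positives, so the classifier selects it into the segment. That is the "extended reach" effect the abstract describes; the threshold of zero log-odds is an arbitrary choice for the sketch.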