Estimating Aggregates over Multiple Sets

  • Authors:
  • Edith Cohen;Haim Kaplan

  • Affiliations:
  • -;-

  • Venue:
  • ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
  • Year:
  • 2008

Quantified Score

Hi-index 0.01

Visualization

Abstract

Many datasets, including market basket data, text or hypertext documents, and measurement data collected in different nodes or time periods, are modeled as a collection of sets over a ground set of (weighted) items. We consider the problem of estimating basic aggregates such as the weight or selectivity of a subpopulation of the items. We extend classic summarization techniques based on sampling to this scenario when we have multiple sets and selection predicates based on membership in particular sets.