Estimating top-k destinations in data streams

  • Authors:
  • Nuno Homem; Joao Paulo Carvalho

  • Affiliations:
  • TULisbon, Instituto Superior Técnico, INESC-ID, Lisboa, Portugal; TULisbon, Instituto Superior Técnico, INESC-ID, Lisboa, Portugal

  • Venue:
  • IPMU'10 Proceedings of the Computational intelligence for knowledge-based systems design, and 13th international conference on Information processing and management of uncertainty
  • Year:
  • 2010

Abstract

This paper considers the problem of estimating the most frequent values in a data stream, where in many cases an approximate answer is sufficient. A novel algorithm is presented that approximates the most frequent values using a hybrid of counter-based and sketch-based techniques. The algorithm is then used to find the most frequent call destinations of individual customers of telecommunications operators. The difficulty is that this detection must be performed for each individual customer and kept up to date at all times. Because the number of customers to check is huge, fast algorithms with a small memory footprint are critical, and approximate answers are enough in most situations. The paper presents the behavior of telecommunications customers to justify the use of approximate algorithms. Although applied here to telecommunications, the algorithm may well be used in other contexts.
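
To make the counter-based half of that hybrid concrete, the sketch below shows Space-Saving, a standard counter-based technique for approximate top-k. It is not the algorithm presented in the paper, whose details the abstract does not give. The function names, the `capacity` parameter, and the destination labels are hypothetical and chosen only for illustration.

```python
def space_saving(stream, capacity):
    """Approximate item counts while monitoring at most `capacity` items.

    Illustrative Space-Saving sketch (an assumption, not the paper's method).
    Returns (counts, errors): counts never underestimate the true frequency,
    and errors[item] bounds how much counts[item] may overestimate it.
    """
    counts = {}
    errors = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < capacity:
            counts[item] = 1
            errors[item] = 0
        else:
            # Table is full: replace the least frequent monitored item and
            # let the newcomer inherit its count as a possible overestimate.
            victim = min(counts, key=counts.get)
            floor = counts.pop(victim)
            errors.pop(victim)
            counts[item] = floor + 1
            errors[item] = floor
    return counts, errors


def top_k(counts, k):
    """Return the k items with the largest estimated counts."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical call destinations for one customer.
calls = ["A", "B", "A", "C", "A", "B", "D", "A"]
counts, errors = space_saving(calls, capacity=3)
print(top_k(counts, 2))
```

With only three counters per customer the memory cost stays fixed regardless of how many distinct destinations appear, which is the kind of small footprint the abstract argues for. The price is that rarely called destinations may carry an overestimated count, bounded by the value stored in `errors`.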