Finding frequent items in parallel

  • Authors:
  • Massimo Cafaro;Piergiulio Tempesta

  • Affiliations:
  • Department of Innovation Engineering, University of Salento, Via per Monteroni, 73100 Lecce, Italy and CMCC—Euro-Mediterranean Centre for Climate Change, Lecce, Italy;Departamento de Física Teórica II, Facultad de Físicas, Universidad Complutense, 28040 Madrid, Spain

  • Venue:
  • Concurrency and Computation: Practice & Experience
  • Year:
  • 2011

Abstract

We present a deterministic parallel algorithm for the k-majority problem. It can be used to find frequent items in parallel, i.e. those whose multiplicity is greater than a given threshold, and is therefore useful for processing iceberg queries and in many other contexts of applied mathematics and information theory. The algorithm can be used in both the online (stream) setting and the offline setting. In the online setting we are restricted to a single scan of the input elements, so the frequent items that have been determined cannot be verified (e.g. network traffic streams passing through internet routers); in the offline setting, a parallel scan of the input can be used to determine the actual k-majority elements. To the best of our knowledge, this is the first parallel algorithm solving the proposed problem. Copyright © 2011 John Wiley & Sons, Ltd.
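
To make the online/offline distinction concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not describe. It substitutes a well-known technique: a Misra-Gries style summary with at most k - 1 counters per block, followed by a merge of the per-block summaries. The names `summarize`, `merge`, `frequent_items` and the `workers` parameter are hypothetical, the threshold is assumed to be n/k, and the blocks are processed one after another here, standing in for independent parallel workers.

```python
from collections import Counter


def summarize(block, k):
    """Single scan of one block, keeping at most k - 1 counters (Misra-Gries style)."""
    counters = {}
    for x in block:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # No free counter: decrement every counter and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def merge(a, b, k):
    """Combine two summaries and prune the result back to at most k - 1 counters."""
    merged = Counter(a)
    merged.update(b)
    if len(merged) > k - 1:
        # Subtract the k-th largest count from all counters; keep only positive ones.
        kth = sorted(merged.values(), reverse=True)[k - 1]
        merged = Counter({x: c - kth for x, c in merged.items() if c > kth})
    return dict(merged)


def frequent_items(items, k, workers=4, verify=True):
    """Return candidates (verify=False, online) or verified items (verify=True, offline)."""
    n = len(items)
    size = max(1, -(-n // workers))  # ceiling division: block length per worker
    blocks = [items[i:i + size] for i in range(0, n, size)]
    # Each block is independent, so this step could run on separate workers.
    summaries = [summarize(block, k) for block in blocks]
    candidates = {}
    for summary in summaries:
        candidates = merge(candidates, summary, k)
    if not verify:
        # Online setting: one scan only, so false positives may remain.
        return set(candidates)
    # Offline setting: a second scan counts the candidates exactly.
    counts = Counter(x for x in items if x in candidates)
    return {x for x, c in counts.items() if c > n / k}
```

With `items = ["a"] * 6 + ["b"] * 4 + ["c", "d", "e"]` and `k = 3`, the assumed threshold is 13/3, so only `"a"` (6 occurrences) qualifies. Calling `frequent_items(items, 3, verify=False)` returns the candidate set `{"a", "b"}`, which still contains the false positive `"b"`; the default verified call returns `{"a"}`.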