Mining frequent itemsets in data streams within a time horizon

  • Authors:
  • Luigi Troiano;Giacomo Scibelli

  • Affiliations:
  • -;-

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2014

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we present an algorithm for mining frequent itemsets in a stream of transactions within a limited time horizon. In contrast to other approaches that are presented in the literature, the proposed algorithm makes use of a test window that can discard non-frequent itemsets from a set of candidates. The efficiency of this approach relies on the property that the higher the support threshold is, the smaller the test window is. In addition to considering a sharp horizon, we consider a smooth window. Indeed, in many applications that are of practical interest, not all of the time slots have the same relevance, e.g., more recent slots can be more interesting than older slots. Smoothness can be determined in both qualitative and quantitative terms. A comparison to other algorithms is conducted. The experimental results prove that the proposed solution is faster than other approaches but has a slightly higher cost in terms of memory.