Pushing constraints into data streams

  • Authors:
  • Andreia Silva; Cláudia Antunes

  • Affiliations:
  • Technical University of Lisbon, Lisbon, Portugal; Technical University of Lisbon, Lisbon, Portugal

  • Venue:
  • Proceedings of the 2nd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
  • Year:
  • 2013

Abstract

One important challenge in data mining is the ability to deal with complex, voluminous and dynamic data. Indeed, owing to great advances in technology, data in many real-world applications arrive as continuous data streams rather than as traditional static datasets. Several techniques have been proposed to explore data streams, in particular for the discovery of frequent co-occurrences in data. However, a common criticism of frequent pattern mining is that it generates a huge number of patterns, regardless of user expertise, which makes the results very hard to analyze and use. These bottlenecks are even more evident when dealing with data streams, since new data arrive continuously and endlessly, and many intermediate results must be kept in memory. Using constraints to filter the results is the most common approach to focusing the discovery on what is really interesting. There is therefore a need to integrate data stream mining with constrained mining. In this work we describe a set of strategies for pushing constraints into data stream mining, through the use of a pattern tree structure that captures a summary of the current possible patterns. We also propose an algorithm that discovers patterns in data streams that satisfy any user-defined constraint.