Statistical supports for mining sequential patterns and improving the incremental update process on data streams

  • Authors:
  • Pierre-Alain Laur; Jean-Emile Symphor; Richard Nock; Pascal Poncelet

  • Affiliations:
  • Grimaag-Dépt Scientifique Interfacultaire, Université Antilles-Guyane, Campus de Schoelcher, B.P. 7209, 97275 Schoelcher Cedex, Martinique, France. E-mail: {palaur,je.symphor,rnock}@mart ...; LG2IP-Ecole des Mines d'Alès, Site EERIE, parc scientifique Georges Besse, 30035 Nîmes Cedex, France. E-mail: pascal.poncelet@ema.fr

  • Venue:
  • Intelligent Data Analysis - Knowledge Discovery from Data Streams
  • Year:
  • 2007

Abstract

Recently, the knowledge extraction community has taken a closer look at new models in which data arrive over time as a fast and continuous flow, i.e. data streams. As only a part of the stream can be stored, mining data streams for sequential patterns and updating previously found frequent patterns must cope with uncertainty. In this paper, we introduce a new statistical approach which biases the initial support for sequential patterns. This approach has the advantage of maximizing either the precision or the recall, as chosen by the user, while limiting the degradation of the other criterion. Moreover, these statistical supports help build statistical borders, which are the relevant sets of frequent patterns to use in an incremental mining process. From the statistical standpoint, theoretical results show that the technique is not far from the optimum. Experiments performed on sequential patterns demonstrate the interest of this approach and the potential of such techniques.
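
The idea of biasing the support threshold can be illustrated with a minimal sketch. The abstract does not state which statistical bound the paper uses, so the code below assumes a one-sided Hoeffding bound applied to a single pattern; the function names, the `delta` risk parameter, and the example frequencies are all hypothetical and chosen only for illustration. Raising the threshold by the margin favours precision (fewer false positives), while lowering it favours recall (fewer missed patterns).

```python
import math


def hoeffding_margin(sample_size: int, delta: float) -> float:
    """One-sided Hoeffding deviation for a frequency estimated on a sample.

    Assumption (not from the abstract): for one fixed pattern, the observed
    frequency exceeds the true frequency by more than this margin with
    probability at most delta. A guarantee over many patterns at once
    would need an extra correction such as a union bound.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(math.log(1.0 / delta) / (2.0 * sample_size))


def biased_support(min_support: float, sample_size: int,
                   delta: float, favor: str) -> float:
    """Shift the user's support threshold to favour precision or recall."""
    eps = hoeffding_margin(sample_size, delta)
    if favor == "precision":
        # Stricter threshold: patterns that pass are very likely frequent.
        return min(1.0, min_support + eps)
    if favor == "recall":
        # Looser threshold: truly frequent patterns are unlikely to be missed.
        return max(0.0, min_support - eps)
    raise ValueError("favor must be 'precision' or 'recall'")


def frequent_patterns(observed: dict, threshold: float) -> set:
    """Keep patterns whose observed frequency reaches the threshold."""
    return {p for p, freq in observed.items() if freq >= threshold}


# Hypothetical observed frequencies on a stored sample of 5000 sequences.
observed = {"a b": 0.12, "a c": 0.09, "b c": 0.10}
strict = biased_support(0.10, 5000, 0.05, "precision")
loose = biased_support(0.10, 5000, 0.05, "recall")
high_precision = frequent_patterns(observed, strict)
high_recall = frequent_patterns(observed, loose)
```

In this sketch, the patterns that fall between the two thresholds are the uncertain ones. Under the same assumptions, they are natural candidates to keep track of during an incremental update, which is loosely the role the abstract assigns to statistical borders.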