Mining uncertain data for constrained frequent sets

  • Authors:
  • Carson Kai-Sang Leung; Dale A. Brajczuk

  • Affiliations:
  • The University of Manitoba, Winnipeg, MB, Canada; The University of Manitoba, Winnipeg, MB, Canada

  • Venue:
  • IDEAS '09 Proceedings of the 2009 International Database Engineering & Applications Symposium
  • Year:
  • 2009

Abstract

Data mining aims to search for implicit, previously unknown, and potentially useful pieces of information (such as sets of items that frequently co-occur) that are embedded in data. The mined frequent sets can be used in the discovery of correlation or causal relations, the analysis of sequences, and the formation of association rules. Since its introduction, frequent set mining has been the subject of numerous studies. Most of these studies find all the frequent sets from transaction databases of precise data, in which the items within each transaction are definitely known. However, there are many real-life situations in which the user is interested in only a small portion of all the frequent sets, and there are also many situations in which the data in the transaction databases are uncertain. This calls for both (i) constrained frequent set mining (which finds frequent sets that satisfy user constraints expressing the user's interest) and (ii) frequent set mining from uncertain data. In this paper, we propose a tree-based system that integrates these two kinds of frequent set mining. The resulting mining system avoids candidate generation, and it pushes the user constraints inside the mining process, which avoids unnecessary computation. Consequently, the system effectively mines transaction databases of uncertain data for only those frequent sets that satisfy the user-specified constraints.
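To make the two ingredients concrete, the following is a minimal sketch and not the tree-based system proposed in the paper. It assumes the commonly used existential-probability model, in which each item in a transaction carries a probability of being present and the expected support of a set is the sum, over transactions, of the product of its items' probabilities. The function names, the sample data, the threshold, and the item-level constraint `allowed` are all hypothetical. The sketch enumerates sets level by level (so, unlike the proposed system, it does generate candidates), but it illustrates how a constraint can be applied before mining rather than as a post-filter.

```python
from math import prod


def expected_support(itemset, transactions):
    """Sum over transactions of the product of the items' existential probabilities.

    Assumption: items within a transaction are independent.
    """
    return sum(
        prod(t[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def constrained_frequent_sets(transactions, minsup, allowed):
    """Return {itemset: expected support} for sets built only from allowed items.

    `allowed` is a hypothetical item-level constraint. It is applied up front,
    so sets containing a disallowed item are never enumerated.
    """
    items = sorted({i for t in transactions for i in t if allowed(i)})
    result = {}
    level = [(i,) for i in items]
    while level:
        next_level = []
        for itemset in level:
            esup = expected_support(itemset, transactions)
            # Expected support cannot grow when items are added, so an
            # infrequent set is not extended any further.
            if esup >= minsup:
                result[itemset] = esup
                last = itemset[-1]
                next_level.extend(itemset + (j,) for j in items if j > last)
        level = next_level
    return result


# Hypothetical uncertain transactions: item -> probability of being present.
transactions = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.6, "c": 0.4},
    {"b": 0.7, "c": 0.9},
]

# Hypothetical constraint: the user is not interested in item "c".
result = constrained_frequent_sets(transactions, minsup=0.7, allowed=lambda i: i != "c")
```

Filtering items before enumeration works here only because the example constraint is checked item by item; constraints over whole sets (for example, on an aggregate of item attributes) would need different handling.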