A Boolean algebraic framework for association and pattern mining

  • Authors:
  • Hatim A. Aboalsamh

  • Affiliations:
  • Department of Computer Sciences, King Saud University, Saudi Arabia

  • Venue:
  • ICCOMP'08 Proceedings of the 12th WSEAS international conference on Computers
  • Year:
  • 2008

Abstract

Data mining and sequential mining analysis are considered crucial components of strategic control across a broad variety of disciplines in business, science, and engineering. Data mining has been defined as the non-trivial extraction of implicit, previously unknown, and potentially useful information from data. Association mining is an important sub-field of data mining in which rules that imply association relationships among a set of items in a transaction database are discovered. In sequence mining, data are represented as sequences of events, where the order of those events is important. Finding patterns in sequences is valuable for predicting future events: in many applications, such as Web applications, the stock market, and genetic analysis, finding patterns in a sequence of elements or events helps in predicting the next event or element. At the conceptual level, association mining and sequence mining are similar processes that use different representations of data. In association mining, items are distinct and the order of items in a transaction is not important, whereas in sequential pattern mining, the order of elements (events) in transactions (sequences) is important, and the same event may occur more than once. In this paper, we propose a new mapping function that maps event sequences into itemsets. Based on the resulting unified representation of association mining and sequential pattern mining, we propose a new approach that uses the Boolean representation of an input database D to build a Boolean matrix M. Boolean algebra operations are applied to M to generate all frequent itemsets. Finally, frequent itemsets or frequent sequential patterns are represented by logical expressions that can be minimized using a suitable logic function minimization technique.
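The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the mapping function or the exact Boolean operations, so everything here is an assumption made for illustration: `sequence_to_itemset` is a hypothetical placeholder that tags each event with its occurrence count (so repeated events become distinct items, though full ordering is not encoded), the matrix M holds one row per transaction and one column per item, and the support of an itemset is counted as the number of rows in which the logical AND of its columns is true. The level-wise candidate search is a generic one, not necessarily the paper's procedure.

```python
def sequence_to_itemset(seq):
    """Hypothetical mapping (not the paper's): tag each event with its
    occurrence count so that repeated events become distinct items."""
    seen = {}
    items = set()
    for event in seq:
        seen[event] = seen.get(event, 0) + 1
        items.add((event, seen[event]))
    return items


def build_boolean_matrix(transactions):
    """Build M: one row per transaction, one Boolean column per item."""
    items = sorted(set().union(*transactions))
    matrix = [[item in t for item in items] for t in transactions]
    return items, matrix


def support(matrix, cols):
    """Count rows where the AND of the selected columns is true."""
    return sum(all(row[c] for c in cols) for row in matrix)


def frequent_itemsets(transactions, min_support):
    """Level-wise search: extend each frequent column tuple by a later column."""
    items, matrix = build_boolean_matrix(transactions)
    frequent = {}
    level = [(c,) for c in range(len(items))]
    while level:
        kept = []
        for cols in level:
            s = support(matrix, cols)
            if s >= min_support:
                frequent[frozenset(items[c] for c in cols)] = s
                kept.append(cols)
        # Only supersets of frequent itemsets can themselves be frequent.
        level = [cols + (c,)
                 for cols in kept
                 for c in range(cols[-1] + 1, len(items))]
    return frequent


# Illustrative data, not taken from the paper.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
result = frequent_itemsets(transactions, min_support=2)
```

Each frequent itemset found this way corresponds to a conjunction of column variables of M; the set of all of them could then be written as a disjunction of such conjunctions and passed to a logic minimizer, which this sketch leaves out.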