An efficient algorithm for mining string databases under constraints

  • Authors:
  • Sau Dan Lee; Luc De Raedt

  • Affiliations:
  • Institute for Computer Science, University of Freiburg, Germany; Institute for Computer Science, University of Freiburg, Germany

  • Venue:
  • KDID'04: Proceedings of the Third International Conference on Knowledge Discovery in Inductive Databases
  • Year:
  • 2004

Abstract

We study the problem of mining substring patterns from string databases. Patterns are selected using a conjunction of monotonic and anti-monotonic predicates. Building on the previously proposed version space tree data structure, we present a novel algorithm for discovering such substring patterns. The algorithm requires only one database scan, which makes it highly scalable and applicable in distributed environments, where the data are not necessarily stored in local memory or on a local disk. We compare the algorithm experimentally with a previously introduced algorithm in the same setting.
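
To make the constraint setting concrete, the following minimal sketch selects substring patterns with a conjunction of one anti-monotonic and one monotonic predicate. It is a brute-force illustration only, not the version space tree algorithm described in the abstract. The choice of predicates (a minimum frequency in one database, a maximum frequency in another) and the names `substring_frequencies`, `mine`, `min_pos`, and `max_neg` are assumptions made for this example.

```python
from collections import Counter


def substring_frequencies(strings):
    """Map each substring to the number of strings that contain it."""
    counts = Counter()
    for s in strings:
        # Use a set so that each string contributes at most 1 per substring.
        seen = {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}
        counts.update(seen)
    return counts


def mine(pos, neg, min_pos, max_neg):
    """Return substrings that occur in at least `min_pos` strings of `pos`
    (anti-monotonic: every substring of a solution also satisfies it) and in
    at most `max_neg` strings of `neg` (monotonic: every superstring of a
    solution also satisfies it).

    Hypothetical helper for illustration; each database is read once, but
    all substrings are enumerated explicitly, which a real miner would avoid.
    """
    pos_freq = substring_frequencies(pos)
    neg_freq = substring_frequencies(neg)
    # Counter returns 0 for patterns that never occur in `neg`.
    return sorted(
        p for p, f in pos_freq.items() if f >= min_pos and neg_freq[p] <= max_neg
    )


result = mine(["abcd", "abce", "xbcd"], ["bcx", "ab"], min_pos=2, max_neg=0)
print(result)  # ['abc', 'bcd', 'cd', 'd']
```

In this toy run, `bc` is frequent in the first database but is rejected because it also occurs in the second, whereas its superstrings `abc` and `bcd` satisfy both predicates.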