Generic pattern mining via data mining template library

  • Authors:
  • Mohammed J. Zaki; Nilanjana De; Feng Gao; Paolo Palmerini; Nagender Parimi; Jeevan Pathuri; Benjarath Phoophakdee; Joe Urban

  • Affiliations:
  • Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY (all authors)

  • Venue:
  • Proceedings of the 2004 European conference on Constraint-Based Mining and Inductive Databases
  • Year:
  • 2004

Abstract

Frequent Pattern Mining (FPM) is a powerful paradigm for mining informative and useful patterns in massive, complex datasets. In this paper we propose the Data Mining Template Library (DMTL), a collection of generic containers and algorithms for data mining, together with persistency and database management classes. DMTL provides a systematic solution to a whole class of common FPM tasks, such as itemset, sequence, tree, and graph mining. It is designed to be extensible and scalable, and to deliver the high performance needed for rapid response on massive datasets. A detailed set of experiments shows that DMTL is competitive with special-purpose algorithms designed for a particular pattern type, especially as database sizes increase.
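
The idea of one generic mining algorithm serving several pattern types can be illustrated with a small sketch. The code below is not DMTL's interface, which the abstract does not describe; it is a minimal Python illustration, assuming a plain level-wise search with naive support counting. The names `mine_frequent`, `extend`, and `contains` are hypothetical, as is the choice to encode every pattern as a tuple of items. The same search loop is reused for itemsets and for sequences by swapping only the two pattern-specific functions.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Sequence, Tuple

# Hypothetical encoding for this sketch: a pattern is a tuple of items.
Pattern = Tuple[Hashable, ...]
Transaction = Sequence[Hashable]


def mine_frequent(
    db: Sequence[Transaction],
    min_support: int,
    extend: Callable[[Pattern, List[Pattern]], Iterable[Pattern]],
    contains: Callable[[Transaction, Pattern], bool],
) -> Dict[Pattern, int]:
    """Generic level-wise search; only `extend` and `contains` know the pattern type."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1")

    def support(p: Pattern) -> int:
        # Naive counting: scan every transaction for every candidate.
        return sum(1 for t in db if contains(t, p))

    frequent: Dict[Pattern, int] = {}
    level: List[Pattern] = []
    # Level 1: every distinct item is a candidate pattern of size one.
    for p in {(x,) for t in db for x in t}:
        s = support(p)
        if s >= min_support:
            frequent[p] = s
            level.append(p)
    singles = list(level)

    # Level k+1: grow each frequent pattern of size k by one frequent item.
    while level:
        next_level: List[Pattern] = []
        seen = set()
        for p in level:
            for q in extend(p, singles):
                if q in seen:
                    continue
                seen.add(q)
                s = support(q)
                if s >= min_support:
                    frequent[q] = s
                    next_level.append(q)
        level = next_level
    return frequent


# Itemsets: order is irrelevant, so patterns are kept as sorted tuples.
def extend_itemset(p, singles):
    return (p + s for s in singles if s[0] > p[-1])

def contains_itemset(t, p):
    return set(p) <= set(t)


# Sequences: order matters and a pattern matches as a subsequence.
def extend_sequence(p, singles):
    return (p + s for s in singles)

def contains_sequence(t, p):
    it = iter(t)
    return all(x in it for x in p)  # consumes `it`, so order is enforced


baskets = [("a", "b", "c"), ("a", "c"), ("a", "d"), ("b", "c")]
itemsets = mine_frequent(baskets, 2, extend_itemset, contains_itemset)

clicks = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
sequences = mine_frequent(clicks, 2, extend_sequence, contains_sequence)
```

Both calls return a dictionary that maps each frequent pattern to its support count. A production library would replace the repeated database scans with more efficient support counting and would add tree and graph pattern types; the sketch only shows how the search can stay independent of the pattern type.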