Towards generic pattern mining

  • Authors:
  • Mohammed J. Zaki; Nagender Parimi; Nilanjana De; Feng Gao; Benjarath Phoophakdee; Joe Urban; Vineet Chaoji; Mohammad Al Hasan; Saeed Salem

  • Affiliations:
  • Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY (all authors)

  • Venue:
  • ICFCA'05: Proceedings of the Third International Conference on Formal Concept Analysis
  • Year:
  • 2005

Abstract

Frequent Pattern Mining (FPM) is a powerful paradigm for mining informative and useful patterns in massive, complex datasets. In this paper we propose the Data Mining Template Library (DMTL), a collection of generic containers and algorithms for FPM, together with persistence and database management classes. DMTL provides a systematic solution to a whole class of common FPM tasks, such as itemset, sequence, tree, and graph mining. DMTL is extensible, scalable, and high-performance, enabling rapid response on massive datasets. Our experiments show that DMTL is competitive with special-purpose algorithms designed for a particular pattern type, especially as database sizes increase.
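
To make the idea of "generic containers and algorithms" concrete, the following is a minimal C++ sketch of a pattern-agnostic miner, assuming a policy-based design. It is not DMTL's actual interface: `ItemsetPolicy`, `support`, `mine`, and `mine_frequent` are hypothetical names chosen for illustration, and support is counted by a naive scan of an in-memory database rather than through any persistence layer. The mining loop itself knows nothing about itemsets; it only asks the policy how to extend a pattern and how to test containment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Illustrative types only; these are assumptions, not DMTL names.
using Item = int;
using Itemset = std::set<Item>;
using Transaction = std::set<Item>;

// Pattern-specific policy for itemsets: containment test and extension rule.
struct ItemsetPolicy {
    using Pattern = Itemset;
    using Record = Transaction;

    // True if every item of p occurs in record r (both ranges are sorted).
    static bool contains(const Record& r, const Pattern& p) {
        return std::includes(r.begin(), r.end(), p.begin(), p.end());
    }

    // Extend p with one item greater than its largest item, so that each
    // itemset is generated exactly once.
    static std::vector<Pattern> extensions(const Pattern& p,
                                           const std::vector<Record>& db) {
        std::set<Item> candidates;
        for (const auto& r : db)
            for (Item i : r)
                if (p.empty() || i > *p.rbegin()) candidates.insert(i);

        std::vector<Pattern> out;
        for (Item i : candidates) {
            Pattern q = p;
            q.insert(i);
            out.push_back(q);
        }
        return out;
    }
};

// Number of records that contain pattern p (naive full scan).
template <typename Policy>
std::size_t support(const typename Policy::Pattern& p,
                    const std::vector<typename Policy::Record>& db) {
    std::size_t count = 0;
    for (const auto& r : db)
        if (Policy::contains(r, p)) ++count;
    return count;
}

// Generic depth-first search: grow a pattern only while it stays frequent.
template <typename Policy>
void mine(const typename Policy::Pattern& prefix,
          const std::vector<typename Policy::Record>& db,
          std::size_t min_support,
          std::map<typename Policy::Pattern, std::size_t>& result) {
    assert(min_support >= 1);
    for (const auto& cand : Policy::extensions(prefix, db)) {
        const std::size_t s = support<Policy>(cand, db);
        if (s >= min_support) {
            result[cand] = s;
            mine<Policy>(cand, db, min_support, result);
        }
    }
}

// Entry point: returns every frequent pattern together with its support.
template <typename Policy>
std::map<typename Policy::Pattern, std::size_t>
mine_frequent(const std::vector<typename Policy::Record>& db,
              std::size_t min_support) {
    std::map<typename Policy::Pattern, std::size_t> result;
    mine<Policy>(typename Policy::Pattern{}, db, min_support, result);
    return result;
}
```

Calling `mine_frequent<ItemsetPolicy>(db, min_support)` on a vector of transactions returns a map from each frequent itemset to its support. Under the same assumptions, a sequence, tree, or graph miner would reuse `mine` unchanged and supply its own policy with a matching `Pattern`, `Record`, `contains`, and `extensions`; a pattern type whose map key lacks `operator<` would also need a comparator.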