NDPMine: efficiently mining discriminative numerical features for pattern-based classification

  • Authors:
  • Hyungsul Kim;Sangkyum Kim;Tim Weninger;Jiawei Han;Tarek Abdelzaher

  • Affiliations:
  • Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Pattern-based classification has demonstrated its power in recent studies, but because the cost of mining discriminative patterns as features in classification is very expensive, several efficient algorithms have been proposed to rectify this problem. These algorithms assume that feature values of the mined patterns are binary, i.e., a pattern either exists or not. In some problems, however, the number of times a pattern appears is more informative than whether a pattern appears or not. To resolve these deficiencies, we propose a mathematical programming method that directly mines discriminative patterns as numerical features for classification. We also propose a novel search space shrinking technique which addresses the inefficiencies in iterative pattern mining algorithms. Finally, we show that our method is an order of magnitude faster, significantly more memory efficient and more accurate than current approaches.