Issues in pattern mining and their resolutions

  • Authors: Tongyuan Wang; Bipin C. Desai
  • Affiliations: Concordia University, Montreal, Canada (both authors)
  • Venue: C3S2E '09 Proceedings of the 2nd Canadian Conference on Computer Science and Software Engineering
  • Year: 2009

Abstract

Frequent pattern mining over large databases is fundamental to many data mining applications. Various approaches have been proposed for pattern mining with respectable computational performance. However, our study has found some fundamental problems, such as overfitting and probability anomaly, that have not been well addressed. We believe that analysing and resolving these problems would improve the reliability and usefulness of mining approaches. This paper reports the first part of our study of these fundamental problems, how they are interrelated, and how they impact the correctness and reliability of pattern mining. We also present our proposal to reformulate the measure "support", to resolve the probability anomaly, and to quantify the degree of overfitting, followed by a brief introduction to and summary of our proposals for resolving the other problems under investigation.
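
For readers unfamiliar with the measure "support" named in the abstract, the sketch below illustrates the conventional definition only: the fraction of transactions that contain a given itemset. It is not the reformulation the paper proposes, which the abstract does not spell out. The function names, the `max_size` cap, and the toy transactions are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def support_counts(transactions, max_size=2):
    """Count, for each itemset of up to max_size items, the transactions containing it."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # deduplicate and fix an item order
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return itemsets whose conventional support (a fraction) is at least min_support."""
    n = len(transactions)
    counts = support_counts(transactions, max_size)
    return {
        itemset: count / n
        for itemset, count in counts.items()
        if count / n >= min_support
    }


# Hypothetical toy data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]

result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

This brute-force enumeration is only practical for tiny inputs. It is meant to make the definition concrete, not to stand in for the mining approaches the abstract refers to.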