Interactive pattern mining on hidden data: a sampling-based solution

  • Authors:
  • Mansurul Bhuiyan;Snehasis Mukhopadhyay;Mohammad Al Hasan

  • Affiliations:
  • Indiana University - Purdue University, Indianapolis (IUPUI), Indianapolis, IN, USA;Indiana University - Purdue University, Indianapolis (IUPUI), Indianapolis, IN, USA;Indiana University - Purdue University, Indianapolis (IUPUI), Indianapolis, IN, USA

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Abstract

Mining frequent patterns from a hidden dataset is an important task with various real-life applications. In this research, we propose a solution to this problem that is based on Markov Chain Monte Carlo (MCMC) sampling of frequent patterns. Instead of returning all the frequent patterns, the proposed paradigm returns a small set of randomly selected patterns so that the confidentiality of the dataset can be maintained. Our solution also allows interactive sampling, so that the sampled patterns can fulfill the user's requirements effectively. We show experimental results from several real-life datasets to validate the capability and usefulness of our solution. In particular, we show by example that, using our proposed solution, an eCommerce marketplace can allow pattern mining on user session data without disclosing the data to the public; such a mining paradigm helps the sellers of the marketplace, which eventually boosts the marketplace's own revenue.
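
To make the idea of MCMC sampling of frequent patterns concrete, the following is a minimal sketch and not the authors' algorithm. It assumes patterns are itemsets, that the target distribution is uniform over all frequent itemsets (the abstract does not state the target), and that the walk moves between itemsets differing by one item, with a Metropolis-Hastings acceptance step. The names `support`, `neighbors`, `sample_frequent_patterns`, and `steps_per_sample` are hypothetical and chosen for illustration only; the interactive feedback component is not modeled.

```python
import random


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def neighbors(itemset, items, transactions, min_support):
    """Frequent itemsets that differ from `itemset` by exactly one item."""
    # Removing an item from a frequent itemset always yields a frequent itemset.
    result = [itemset - {i} for i in itemset]
    # Adding an item is allowed only if the superset is still frequent.
    for i in items - itemset:
        candidate = itemset | {i}
        if support(candidate, transactions) >= min_support:
            result.append(candidate)
    return result


def sample_frequent_patterns(transactions, min_support, n_samples,
                             steps_per_sample=50, seed=0):
    """Metropolis-Hastings walk over frequent itemsets (assumed uniform target).

    Only the sampled itemsets are returned; the transactions stay hidden.
    Assumes min_support <= len(transactions), so the empty itemset is a
    valid (frequent) starting state.
    """
    rng = random.Random(seed)
    transactions = [frozenset(t) for t in transactions]
    items = frozenset().union(*transactions)
    current = frozenset()
    current_nbrs = neighbors(current, items, transactions, min_support)
    samples = []
    for _ in range(n_samples):
        for _ in range(steps_per_sample):
            if not current_nbrs:
                break  # no frequent single item exists; the walk cannot move
            proposal = rng.choice(current_nbrs)
            proposal_nbrs = neighbors(proposal, items, transactions, min_support)
            # Uniform proposal over neighbors, so the acceptance ratio for a
            # uniform target is deg(current) / deg(proposal).
            if rng.random() < min(1.0, len(current_nbrs) / len(proposal_nbrs)):
                current, current_nbrs = proposal, proposal_nbrs
        samples.append(current)
    return samples
```

A caller who owns the hidden data could run, for example, `sample_frequent_patterns(sessions, min_support=2, n_samples=5)` and publish only the returned itemsets. The number of steps between samples is an arbitrary choice here; in practice it would need to be large enough for the chain to mix.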