Efficient mining maximum frequent pagesets with double dwell time constraint

  • Authors:
  • Ren Jiadong; Zhang Xiaojian; Thomas B. Hodel-Widmer

  • Affiliations:
  • College of Information Science and Engineering, YanShan University, Qinhuangdao City, Hebei Province, P.R. China; College of Information Science and Engineering, YanShan University, Qinhuangdao City, Hebei Province, P.R. China; International Relations and Security Network, Federal Institute of Technology (ETH Zurich), Zurich, Switzerland

  • Venue:
  • AIA'06 Proceedings of the 24th IASTED international conference on Artificial intelligence and applications
  • Year:
  • 2006

Abstract

Web usage mining applies data mining techniques to large web log databases in order to discover frequent pagesets and usage patterns. However, most previous research considers only the whole database, and mining the full set of frequent pagesets and patterns is unrealistic. We therefore introduce a double dwell time constraint that restricts the database according to the decision-maker's (user's) mining purpose. Recent work has highlighted the importance of constraint-based Maximum Frequent Pageset (MFP) mining, so we design an efficient algorithm, Maximum Frequent PageSet Mining (MFPSM), for mining MFPs. Based on the FP-tree, we present a data structure called the Dwell Time Frequent Page tree (DTFP-tree) to store the session database. The DTFP-tree reduces the size of the original FP-tree and simplifies the setting of time thresholds during mining. Our experiments show that the algorithm significantly reduces mining runtime, provided the decision-makers (users) give appropriate dwell time constraints, and that it outperforms other algorithms for mining MFPs.
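To make the terms in the abstract concrete, the following is a minimal sketch of dwell-time-constrained maximal frequent pageset mining. It is not the MFPSM algorithm and does not build a DTFP-tree; it uses a plain level-wise (Apriori-style) search instead. It also assumes that the "double dwell time" constraint means a lower and an upper bound on the time spent on a page, which the abstract does not spell out. All names (`filter_by_dwell_time`, `frequent_pagesets`, `maximal_pagesets`, `min_dwell`, `max_dwell`, `min_support`) and the sample sessions are hypothetical.

```python
from itertools import combinations


def filter_by_dwell_time(sessions, min_dwell, max_dwell):
    """Keep only pages whose dwell time lies in [min_dwell, max_dwell].

    Assumption: the double dwell time constraint is a pair of bounds
    (lower and upper) applied to each page visit.
    """
    filtered = []
    for session in sessions:
        pages = frozenset(
            page for page, dwell in session if min_dwell <= dwell <= max_dwell
        )
        if pages:  # drop sessions in which no page satisfies the constraint
            filtered.append(pages)
    return filtered


def frequent_pagesets(sessions, min_support):
    """Level-wise enumeration of pagesets contained in >= min_support sessions."""
    frequent = {}
    current = [frozenset([p]) for p in sorted({p for s in sessions for p in s})]
    while current:
        # Count, for each candidate, the sessions that contain it.
        counts = {c: sum(1 for s in sessions if c <= s) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-pagesets that differ in one page into (k+1)-candidates.
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == len(a) + 1:
                candidates.add(union)
        current = list(candidates)
    return frequent


def maximal_pagesets(frequent):
    """A frequent pageset is maximal if no proper superset is also frequent."""
    return [s for s in frequent if not any(s < t for t in frequent)]


# Hypothetical sessions: lists of (page, dwell time in seconds).
sessions = [
    [("A", 30), ("B", 45), ("C", 2)],
    [("A", 50), ("B", 20), ("D", 600)],
    [("A", 15), ("C", 40)],
    [("B", 35), ("C", 25), ("A", 1)],
]

constrained = filter_by_dwell_time(sessions, min_dwell=10, max_dwell=300)
frequent = frequent_pagesets(constrained, min_support=2)
maximal = maximal_pagesets(frequent)
```

With these sample values, the very short visits (2 s and 1 s) and the very long one (600 s) are discarded before mining, so the search runs over smaller sessions. A tree-based structure such as the DTFP-tree described in the abstract would replace the repeated session scans used in this sketch.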